diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index b3db86b9..62238a97 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -61,7 +61,6 @@ packages/
 ├── plugins/ @red-codes/plugins — Plugin ecosystem (discovery, registry, validation, sandboxing)
 ├── renderers/ @red-codes/renderers — Renderer plugin system (registry, TUI renderer)
 ├── sdk/ @red-codes/sdk — Agent SDK for programmatic governance integration
-├── swarm/ @red-codes/swarm — Shareable agent swarm templates
 ├── scheduler/ @red-codes/scheduler — Task scheduler, queue, lease manager, and worker orchestration
 └── telemetry-client/ @red-codes/telemetry-client — Telemetry client (identity, signing, queue, sender)
diff --git a/CLAUDE.md b/CLAUDE.md
index 93fc0c89..42c44fe6 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -140,12 +140,6 @@ packages/
 │ └── types.ts # Storage type definitions
 ├── telemetry/src/ # @red-codes/telemetry — Runtime telemetry and logging
 ├── telemetry-client/src/ # @red-codes/telemetry-client — Telemetry client (identity, signing, queue, sender)
-├── swarm/src/ # @red-codes/swarm — Shareable agent swarm templates
-│ ├── config.ts # Swarm configuration
-│ ├── manifest.ts # Swarm manifest parsing
-│ ├── scaffolder.ts # Swarm scaffolding
-│ ├── types.ts # Swarm type definitions
-│ └── index.ts # Module re-exports
 ├── sdk/src/ # @red-codes/sdk — Agent SDK for programmatic governance
 │ ├── sdk.ts # SDK implementation
 │ ├── session.ts # Session management
@@ -254,8 +248,7 @@ Each workspace package maps to a single architectural concept:
 - **packages/telemetry/** — Runtime telemetry and logging
 - **packages/telemetry-client/** — Telemetry client (identity, signing, queue, sender)
 - **packages/sdk/** — Agent SDK for programmatic governance integration
-- **packages/swarm/** — Shareable agent swarm templates (config, manifest, scaffolder)
-- **packages/scheduler/** — Task scheduler, queue, lease manager, and worker orchestration for swarm
+- **packages/scheduler/** — Task scheduler, queue, lease manager, and worker orchestration
 - **apps/cli/** — CLI entry point and commands (published as `@red-codes/agentguard`)
 - **packages/invariant-data-protection/** — Data protection invariant plugin
 - **apps/mcp-server/** — MCP governance server (15 governance tools)
@@ -379,7 +372,7 @@ pnpm test --filter=@red-codes/kernel # Test a single package
 **Test structure:**
 - **Vitest workspace** (`vitest.workspace.ts`): orchestrates tests across all packages
 - **TypeScript tests** (distributed across `packages/*/tests/` and `apps/*/tests/`): vitest
-- **Coverage areas**: adapters (file, git, shell, claude-code, copilot-cli, hook integrity), kernel (AAB, engine, monitor, blast radius, heartbeat, integration, e2e pipeline, conformance, tiers, intent drift, enforcement audit, interventions), CLI commands (args, guard, inspect, init, simulate, ci-check, claude-hook, claude-init, export/import, policy-validate, policy-verify, diff, evidence-pr, traces, plugin, auto-setup, config, demo, migrate), decision records, domain models, events, evidence packs (explainable, explanation chain), evidence summary, execution log, export-import roundtrip, impact forecast, invariants, matchers (path-matcher, command-scanner, policy-matcher, benchmark), notification formatter, plugins (discovery, registry, sandbox, validation), policy evaluation (including composer, pack loader, policy packs, evaluation trace, forecast conditions, gate conditions, persona, trust, pack versioning), renderers, replay (engine, comparator, processor), simulation (filesystem, git, package, dependency graph), SQLite storage (migrations, session, sink, store, cross-run, factory, aggregation queries, commands), swarm (scaffolder, config, manifest), telemetry (event queue, event sender, anonymize, cloud sink, event mapper), TUI renderer, violation mapper, VS Code event reader, YAML loading
+- **Coverage areas**: adapters (file, git, shell, claude-code, copilot-cli, hook integrity), kernel (AAB, engine, monitor, blast radius, heartbeat, integration, e2e pipeline, conformance, tiers, intent drift, enforcement audit, interventions), CLI commands (args, guard, inspect, init, simulate, ci-check, claude-hook, claude-init, export/import, policy-validate, policy-verify, diff, evidence-pr, traces, plugin, auto-setup, config, demo, migrate), decision records, domain models, events, evidence packs (explainable, explanation chain), evidence summary, execution log, export-import roundtrip, impact forecast, invariants, matchers (path-matcher, command-scanner, policy-matcher, benchmark), notification formatter, plugins (discovery, registry, sandbox, validation), policy evaluation (including composer, pack loader, policy packs, evaluation trace, forecast conditions, gate conditions, persona, trust, pack versioning), renderers, replay (engine, comparator, processor), simulation (filesystem, git, package, dependency graph), SQLite storage (migrations, session, sink, store, cross-run, factory, aggregation queries, commands), telemetry (event queue, event sender, anonymize, cloud sink, event mapper), TUI renderer, violation mapper, VS Code event reader, YAML loading
 ## CI/CD & Automation
diff --git a/README.md b/README.md
index 927db6f8..4c312361 100644
--- a/README.md
+++ b/README.md
@@ -20,11 +20,7 @@ Install in 30 seconds. Your agents can't break what matters.

 AI coding agents (Claude Code, Codex CLI, GitHub Copilot CLI, Google Gemini CLI, OpenCode, Goose, and more) run autonomously — writing files, executing commands, pushing code. AgentGuard prevents them from doing catastrophic things: no accidental pushes to main, no credential leaks, no runaway destructive loops. 26 built-in safety checks, zero config required.
 **For individuals:** stop your AI from wrecking your machine or repo.
-**For teams:** run fleets of agents safely at scale, with audit trails that pass compliance.
-
-> **See it live** — We run 100+ autonomous AI agents building AgentGuard itself, governed 24/7.
-> Every deny, every escalation, every code review — visible in real time.
-> **[Watch the live swarm →](https://agentguard-cloud-office-sim.vercel.app)**
+**For teams:** govern agents at scale, with audit trails that pass compliance.
 ## What Problem Does AgentGuard Solve?
@@ -121,7 +117,6 @@ agentguard cloud login # Opens browser → authenticate → CLI
 | Link | Description |
 |------|-------------|
-| **[Live Office](https://agentguard-cloud-office-sim.vercel.app)** | **Watch our 100+ agent swarm build software 24/7** — real-time governance visualization |
 | [Dashboard](https://agentguard-cloud-dashboard.vercel.app) | Team dashboard — runs, violations, analytics |
 ## Agent Identity
@@ -144,7 +139,6 @@ Identity consists of a **role** (`developer`, `reviewer`, `ops`, `security`, `ci
 | **48 event kinds** | Full lifecycle telemetry: `ActionRequested → ActionAllowed/Denied → ActionExecuted` |
 | **Real-time cloud dashboard** | Telemetry streams to your team dashboard; opt-in, anonymous by default |
 | **Multi-tenant** | Team workspaces, GitHub/Google OAuth, SSO-ready |
-| **Live Office visualization** | [24/7 live view](https://agentguard-cloud-office-sim.vercel.app) of our actual 100+ agent swarm — watch AI build software under governance |
 | **Agent SDK** | Programmatic governance for custom integrations and RunManifest-driven workflows |
 | **Agent identity** | Declare agent role + driver for governance telemetry — automatic prompt or CLI flag |
 | **Pre-push hooks** | Branch protection enforcement via git pre-push hooks, configured from agentguard.yaml |
@@ -357,7 +351,7 @@ rules:
 | `recursive-operation-guard` | Low | `find -exec`, `xargs` with write/delete |
 | `lockfile-integrity` | Low | `package.json` changes without lockfile sync |
 | `no-verify-bypass` | High | `git push/commit --no-verify` — prevents skipping pre-push/pre-commit hooks |
-| `no-self-approve-pr` | Critical | Agents merging or approving PRs they authored — enforces separation of duties in multi-agent swarms |
+| `no-self-approve-pr` | Critical | Agents merging or approving PRs they authored — enforces separation of duties in multi-agent setups |
 | `cross-repo-blast-radius` | High | Caps cumulative unique files written across all repos in a session (default: 50 files) |
 ## Architecture
@@ -539,7 +533,6 @@ agentguard cloud login # Connect after you have an API key
 | Resource | URL |
 |----------|-----|
 | Dashboard | [agentguard-cloud-dashboard.vercel.app](https://agentguard-cloud-dashboard.vercel.app) |
-| **Live Office** | **[agentguard-cloud-office-sim.vercel.app](https://agentguard-cloud-office-sim.vercel.app)** — watch our swarm build software 24/7 |
 | Website | [agentguardhq.github.io/agentguard](https://agentguardhq.github.io/agentguard/) |
 | Docs | [docs/](docs/) |
 | Architecture | [docs/unified-architecture.md](docs/unified-architecture.md) |
@@ -557,7 +550,7 @@ agentguard cloud login # Connect after you have an API key
 | [Octi Pulpo](https://github.com/AgentGuardHQ/octi-pulpo) | Coordination — pipeline controller, model routing |
 | [ShellForge](https://github.com/AgentGuardHQ/shellforge) | Orchestration — multi-runtime agent execution |
 | [Preflight](https://github.com/AgentGuardHQ/preflight) | Protocol — universal design-before-you-build standard |
-| [Extensions](https://github.com/AgentGuardHQ/agentguard-extensions) | Drivers, integrations, policies, example swarms |
+| [Extensions](https://github.com/AgentGuardHQ/agentguard-extensions) | Drivers, integrations, policies |
 ## License
diff --git a/ROADMAP.md b/ROADMAP.md
index 8fbddaa1..da20f7c4 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -12,7 +12,7 @@
 AgentGuard is the **Execution Control Plane for autonomous AI agents** — the independent governance layer that sits between autonomous agents and the real world. All agent side effects must pass through deterministic governance before reaching the environment, regardless of which orchestration framework, cloud provider, or AI model powers the agents.
-**Strategic positioning**: Autonomous Execution Governance (AEG). Like Okta for the application layer, AgentGuard controls the trust boundary without replacing the underlying systems. The OSS repo houses Layer 1 (Kernel — the moat) and Layer 2 (Studio Runtime — adapters, swarm templates, execution profiles, and the `agentguard init studio` wizard that bootstraps governed workspaces).
+**Strategic positioning**: Autonomous Execution Governance (AEG). Like Okta for the application layer, AgentGuard controls the trust boundary without replacing the underlying systems. The OSS repo houses Layer 1 (Kernel — the moat) and Layer 2 (Studio Runtime — adapters, execution profiles, and the `agentguard init studio` wizard that bootstraps governed workspaces).
 **Core thesis**: Once autonomous agents start modifying production systems, organizations need deterministic execution governance. Prompt alignment cannot solve this. Only a reference monitor architecture — default-deny, tamper-evident, fully auditable — provides the guarantees enterprises require. Orchestration is commoditizing (LangGraph, CrewAI, AutoGen, platform-level tools); governance remains scarce.
@@ -41,7 +41,7 @@ AgentGuard is the **Execution Control Plane for autonomous AI agents** — the i
 | MCP governance server (15 tools) | Implemented | Production |
 | Plugin ecosystem (discovery, registry, sandboxing) | Implemented | Production |
 | 8 policy packs (essentials, strict, ci-safe, enterprise, open-source, soc2, hipaa, eng-standards) | Implemented | Production |
-| 26-agent autonomous swarm templates | Implemented | Production |
+| Multi-agent governance templates | Implemented | Production |
 | KE-1 Structured matchers (Aho-Corasick, globs, reason codes) | **Shipped v2.3.0** | `packages/matchers/` |
 | All 46 event kinds mapped to cloud AgentEvent | **Shipped v2.3.0** | `packages/telemetry/src/event-mapper.ts` |
 | Agent SDK for programmatic governance | **Shipped v2.3.0** | Programmatic governance integration |
@@ -61,7 +61,7 @@ AgentGuard is the **Execution Control Plane for autonomous AI agents** — the i
 | No-verify-bypass invariant (#24) — blocks `git push/commit --no-verify` | **Shipped v2.6.0** | `packages/invariants/src/definitions.ts` |
 | Read-only operations permitted on protected paths | **Shipped v2.7.0** | `packages/adapters/src/file.ts` (closes #648) |
 | Install attribution telemetry — opt-in postinstall ping (version, OS, Node, CI env, anon ID) | **Shipped v2.7.0** | `apps/cli/src/postinstall.ts` (PR #991) |
-| `agentguard init studio` wizard, execution profiles, swarm template schema | **Shipped v2.7.0** | `apps/cli/src/commands/init.ts`, `packages/swarm/` (PR #987) |
+| `agentguard init studio` wizard + execution profiles | **Shipped v2.7.0** | `apps/cli/src/commands/init.ts` (PR #987) |
 | OpenCode driver support | **Shipped v2.7.0** | Agent driver registry (PR #1019) |
 | Codex CLI adapter (PreToolUse/PostToolUse hook commands) | **Shipped v2.8.0** | `packages/adapters/src/codex-cli.ts` (PR #1024) |
 | Gemini CLI adapter (BeforeTool/AfterTool hook commands) | **Shipped v2.8.0** | `packages/adapters/src/gemini-cli.ts` (PR #1024) |
@@ -197,7 +197,7 @@ This sprint implements the architectural upgrades required for AgentGuard to fun
 **Traction note (2026-03-24)**: npm reports ~1,761 weekly downloads, but investigation shows the majority are internal Vercel CI builds of `agentguard-cloud` which pins `@red-codes/agentguard@2.0.0`. Each Vercel build (ephemeral containers, preview deploys, branch builds) triggers a fresh `npm install`. Real external adoption is likely in the low hundreds. This makes install attribution tracking and the user capture funnel critical — without them, we cannot distinguish real adoption from CI noise. The version drift (cloud at 2.0.0 vs OSS at 2.4.0) should also be resolved.
-**Release cadence**: v3.0 (KE-2 ActionContext + stranger test + capture funnel), v3.1 (Runner + `apps/runner`), v3.2+ (advanced integrations). Note: `agentguard init studio` wizard, execution profiles, swarm template schema, and install attribution all shipped early in v2.7.x ahead of schedule; Codex CLI + Gemini CLI + DeepAgents adapters shipped in v2.8.x (latest: v2.8.4).
+**Release cadence**: v3.0 (KE-2 ActionContext + stranger test + capture funnel), v3.1 (Runner + `apps/runner`), v3.2+ (advanced integrations). Note: `agentguard init studio` wizard, execution profiles, and install attribution all shipped early in v2.7.x ahead of schedule; Codex CLI + Gemini CLI + DeepAgents adapters shipped in v2.8.x (latest: v2.8.4).
 ### Next — Pull-Based Runner (Phase 6.5 — `apps/runner`)
@@ -225,9 +225,8 @@ Depends on: v3.0 released + Cloud Phase 2A (orchestrator + runner protocol).
 Shipped ahead of schedule in v2.7.x; dependency on v3.0 stranger test waived for early delivery.
-- [x] ~~**`agentguard init studio` wizard**~~ — ✅ Done 2026-03-26 — detects project type (monorepo/single), CI/CD, test framework, agent runtimes; offers execution profile + swarm preset selection (full/qa-focused/dev-ops/minimal); `--non-interactive` mode for CI; optional Cloud connection (PR #987)
+- [x] ~~**`agentguard init studio` wizard**~~ — ✅ Done 2026-03-26 — detects project type (monorepo/single), CI/CD, test framework, agent runtimes; offers execution profile selection (full/qa-focused/dev-ops/minimal); `--non-interactive` mode for CI; optional Cloud connection (PR #987)
 - [x] ~~**Execution profiles**~~ — ✅ Done 2026-03-26 — `ci-safe` and `enterprise` profiles shipped; 6 profiles total via `agentguard init --profile ` (PR #987)
-- [x] ~~**Swarm template schema**~~ — ✅ Done 2026-03-26 — canonical JSON schema for swarm manifest, squad manifest, swarm config with zero-dependency runtime validator (PR #987)
 ### Next — Capability-Scoped Sessions (Phase 7)
diff --git a/apps/cli/README.md b/apps/cli/README.md
index 6792de76..dc76b73a 100644
--- a/apps/cli/README.md
+++ b/apps/cli/README.md
@@ -79,7 +79,7 @@ Every governance session requires an agent identity. Resolution order:
 If no identity is set, PreToolUse hooks **block all actions** with a message directing the agent to identify itself. The `.agentguard-identity` file is session-scoped and gitignored — it is blanked on session start/stop to prevent stale values.
-For autonomous agent swarms, pass identity via env var per-process or `--agent-name` flag:
+For multi-agent setups, pass identity via env var per-process or `--agent-name` flag:
 ```bash
 aguard guard --agent-name "builder-agent-3" --policy agentguard.yaml
@@ -217,16 +217,6 @@ aguard status # Confirms: ⚡ Token optimization active
 Works with git, npm, cargo, tsc, docker, kubectl, and more. No configuration needed — AgentGuard detects RTK automatically.
-## Agent Swarm
-
-AgentGuard ships with a 26-agent autonomous development swarm:
-
-```bash
-aguard init swarm # Scaffolds agents, skills, and governance into your repo
-```
-
-Agents handle implementation, code review, CI triage, security audits, planning, docs, and more — all under governance.
-
 ## Links
 - [GitHub](https://github.com/AgentGuardHQ/agentguard)
diff --git a/apps/cli/package.json b/apps/cli/package.json
index 131c9f66..a0346c6d 100644
--- a/apps/cli/package.json
+++ b/apps/cli/package.json
@@ -57,7 +57,6 @@
     "@red-codes/policy": "workspace:*",
     "@red-codes/renderers": "workspace:*",
     "@red-codes/storage": "workspace:*",
-    "@red-codes/swarm": "workspace:*",
     "@red-codes/telemetry": "workspace:*",
     "@red-codes/telemetry-client": "workspace:*",
     "@types/better-sqlite3": "^7.6.0",
diff --git a/apps/cli/src/commands/init.ts b/apps/cli/src/commands/init.ts
index 4c92b86a..c99f5de6 100644
--- a/apps/cli/src/commands/init.ts
+++ b/apps/cli/src/commands/init.ts
@@ -72,11 +72,6 @@ export async function init(args: string[]): Promise {
     return initFirestore(dir);
   }
-  // Swarm scaffolding mode
-  if (extensionType === 'swarm') {
-    return initSwarm(parsed);
-  }
-
   // Studio wizard — interactive project bootstrap
   if (extensionType === 'studio') {
     return initStudio(parsed);
@@ -1287,122 +1282,6 @@ AGENTGUARD_STORE=firestore
   return 0;
 }
-/**
- * Scaffold the agent swarm: copy skill templates, render config, and output
- * scheduled task definitions for registration.
- */
-async function initSwarm(parsed: ReturnType): Promise {
-  const dir = parsed.flags.dir as string | undefined;
-  const force = parsed.flags.force === true || parsed.flags.force === 'true';
-  const tiersFlag = parsed.flags.tiers as string | undefined;
-  const tiers = tiersFlag ? tiersFlag.split(',').map((t) => t.trim()) : undefined;
-  const projectRoot = resolve(dir ?? '.');
-
-  let scaffoldFn: typeof import('@red-codes/swarm').scaffold;
-  try {
-    const swarmModule = await import('@red-codes/swarm');
-    scaffoldFn = swarmModule.scaffold;
-  } catch {
-    console.error(`\n ${color('Error', 'red')}: @red-codes/swarm package not found.`);
-    console.error(` Install it with: pnpm add @red-codes/swarm\n`);
-    return 1;
-  }
-
-  const result = scaffoldFn({ projectRoot, force, tiers });
-
-  console.log(
-    `\n ${color('✓', 'green')} Swarm initialized (${bold(String(result.agents.length))} agents, ${bold(String(result.skillsWritten + result.skillsSkipped))} skills)\n`
-  );
-
-  if (result.configWritten) {
-    console.log(
-      ` ${dim('Created')} agentguard-swarm.yaml ${dim('(customize schedules, paths, labels)')}`
-    );
-  }
-
-  console.log(
-    ` ${dim('Skills written:')} ${result.skillsWritten} ${dim('Skipped (existing):')} ${result.skillsSkipped}\n`
-  );
-
-  // Print agent table
-  console.log(
-    ` ${bold('Agent')}${' '.repeat(28)}${bold('Tier')}${' '.repeat(8)}${bold('Schedule')}`
-  );
-  console.log(` ${'─'.repeat(65)}`);
-  for (const agent of result.agents) {
-    const name = agent.name.padEnd(33);
-    const tier = agent.tier.padEnd(12);
-    console.log(` ${name}${tier}${agent.cron}`);
-  }
-
-  console.log(`\n ${bold('Next steps:')}`);
-  console.log(` ${dim('# Register scheduled tasks (run inside Claude Code):')}`);
-  console.log(
-    ` ${dim('# The agent prompts are in .claude/skills/ — use them with the scheduled tasks API')}`
-  );
-  console.log(
-    ` ${dim('# Or use the register-swarm-tasks skill to auto-register all agents')}\n`
-  );
-
-  // Write a register-swarm-tasks skill
-  const registerSkillPath = join(projectRoot, '.claude', 'skills', 'register-swarm-tasks.md');
-  if (!existsSync(registerSkillPath) || force) {
-    const registerContent = buildRegisterSkill(result);
-    mkdirSync(join(projectRoot, '.claude', 'skills'), { recursive: true });
-    writeFileSync(registerSkillPath, registerContent, 'utf8');
-    console.log(` ${dim('Created')} .claude/skills/register-swarm-tasks.md\n`);
-  }
-
-  return 0;
-}
-
-function buildRegisterSkill(result: {
-  agents: ReadonlyArray<{
-    id: string;
-    name: string;
-    tier: string;
-    cron: string;
-    description: string;
-    prompt: string;
-  }>;
-}): string {
-  const lines = [
-    '# Skill: Register Swarm Tasks',
-    '',
-    'Register all swarm agents as scheduled tasks. Run this once after `agentguard init swarm`.',
-    '',
-    '## Autonomy Directive',
-    '',
-    'This skill runs interactively. Confirm with the user before creating tasks.',
-    '',
-    '## Steps',
-    '',
-    '### 1. Create Scheduled Tasks',
-    '',
-    'Use the `mcp__scheduled-tasks__create_scheduled_task` tool to register each agent:',
-    '',
-  ];
-
-  for (const agent of result.agents) {
-    lines.push(`#### ${agent.name}`);
-    lines.push('');
-    lines.push(`- **Task ID**: \`${agent.id}\``);
-    lines.push(`- **Cron**: \`${agent.cron}\``);
-    lines.push(`- **Description**: ${agent.description}`);
-    lines.push(`- **Prompt**: Use the content from the \`${agent.id}\` prompt template`);
-    lines.push('');
-  }
-
-  lines.push('### 2. Verify');
-  lines.push('');
-  lines.push(
-    'After creating all tasks, use `mcp__scheduled-tasks__list_scheduled_tasks` to verify they are registered.'
-  );
-  lines.push('');
-
-  return lines.join('\n');
-}
-
 // ---------------------------------------------------------------------------
 // Studio wizard — interactive project bootstrap
 // ---------------------------------------------------------------------------
@@ -1484,13 +1363,6 @@ const PROFILE_DESCRIPTIONS: Record = {
   permissive: 'Minimal — only block the most dangerous operations',
 };
-const SWARM_PRESETS: Record = {
-  full: ['core', 'governance', 'ops', 'quality', 'marketing'],
-  'qa-focused': ['core', 'quality'],
-  'dev-ops': ['core', 'governance', 'ops'],
-  minimal: ['core'],
-};
-
 async function studioPrompt(question: string, options: string[], defaultIdx = 0): Promise {
   if (!process.stdin.isTTY) return defaultIdx;
@@ -1571,30 +1443,7 @@ async function initStudio(parsed: ReturnType): Promise
   }
   const selectedProfile = profileNames[profileIdx];
-  // Step 3: Select swarm preset
-  const presetNames = Object.keys(SWARM_PRESETS);
-  const presetOptions = presetNames.map(
-    (p) => `${bold(p)} — tiers: ${SWARM_PRESETS[p].join(', ')}`
-  );
-
-  let defaultPresetIdx = 0; // full
-  if (!detection.isMonorepo) {
-    defaultPresetIdx = presetNames.indexOf('minimal');
-  }
-
-  let swarmPresetIdx: number;
-  let includeSwarm: boolean;
-  if (nonInteractive) {
-    includeSwarm = true;
-    swarmPresetIdx = defaultPresetIdx;
-  } else {
-    includeSwarm = await studioConfirm(`\n ${bold('Include agent swarm?')}`, true);
-    swarmPresetIdx = includeSwarm
-      ? await studioPrompt('Select swarm preset:', presetOptions, defaultPresetIdx)
-      : 0;
-  }
-
-  // Step 4: Scaffold execution profile
+  // Step 3: Scaffold execution profile
   mkdirSync(projectRoot, { recursive: true });
   const templatesDir = resolveTemplatesDir();
   const templatePath = join(templatesDir, `${selectedProfile}.yaml`);
@@ -1626,40 +1475,7 @@ async function initStudio(parsed: ReturnType): Promise
     );
   }
-  // Step 5: Scaffold swarm (if selected)
-  if (includeSwarm) {
-    const selectedPreset = presetNames[swarmPresetIdx];
-    const tiers = SWARM_PRESETS[selectedPreset];
-
-    let scaffoldFn: typeof import('@red-codes/swarm').scaffold;
-    try {
-      const swarmModule = await import('@red-codes/swarm');
-      scaffoldFn = swarmModule.scaffold;
-    } catch {
-      console.error(
-        `\n ${color('⚠', 'yellow')} @red-codes/swarm not found — skipping swarm scaffold`
-      );
-      scaffoldFn = null as unknown as typeof import('@red-codes/swarm').scaffold;
-    }
-
-    if (scaffoldFn) {
-      try {
-        const result = scaffoldFn({ projectRoot, tiers });
-
-        console.log(
-          ` ${color('✓', 'green')} Swarm: ${bold(selectedPreset)} preset (${result.agents.length} agents, ${tiers.join(', ')})`
-        );
-        console.log(
-          ` ${dim('Skills written:')} ${result.skillsWritten} ${dim('Skipped:')} ${result.skillsSkipped}`
-        );
-      } catch (err) {
-        const msg = err instanceof Error ? err.message : String(err);
-        console.error(` ${color('⚠', 'yellow')} Swarm scaffold failed: ${msg}`);
-      }
-    }
-  }
-
-  // Step 6: Set up Claude Code hooks (if Claude Code detected)
+  // Step 4: Set up Claude Code hooks (if Claude Code detected)
   if (detection.agentRuntimes.includes('Claude Code')) {
     const setupHooks =
       nonInteractive ||
@@ -1690,12 +1506,6 @@ async function initStudio(parsed: ReturnType): Promise
   console.log(`\n ${color('✓', 'green')} ${bold('Studio initialized')}\n`);
   console.log(` ${bold('What was set up:')}`);
   console.log(` ${dim('Policy:')} ${selectedProfile} execution profile`);
-  if (includeSwarm) {
-    const selectedPreset = presetNames[swarmPresetIdx];
-    console.log(
-      ` ${dim('Swarm:')} ${selectedPreset} preset (${SWARM_PRESETS[selectedPreset].join(', ')})`
-    );
-  }
   if (detection.agentRuntimes.includes('Claude Code')) {
     console.log(` ${dim('Hooks:')} Claude Code governance hooks`);
   }
@@ -1703,10 +1513,6 @@ async function initStudio(parsed: ReturnType): Promise
   console.log(`\n ${bold('Next steps:')}`);
   console.log(` ${dim('#')} Review and customize agentguard.yaml`);
   console.log(` aguard guard --dry-run`);
-  if (includeSwarm) {
-    console.log(` ${dim('#')} Register swarm tasks in Claude Code`);
-    console.log(` ${dim('#')} See .claude/skills/register-swarm-tasks.md`);
-  }
   console.log('');
   return 0;
@@ -1714,7 +1520,7 @@ async function initStudio(parsed: ReturnType): Promise
 function printInitHelp(): void {
   console.log(`
-  ${bold('agentguard init')} — Scaffold a new governance extension, policy template, or agent swarm
+  ${bold('agentguard init')} — Scaffold a new governance extension or policy template
   ${bold('Usage:')}
     agentguard init --extension [--name ] [--dir ]
@@ -1730,10 +1536,7 @@ function printInitHelp(): void {
     simulator Custom action simulator
   ${bold('Studio wizard:')}
-    studio Interactive workspace bootstrap — detect project, select profile + swarm
-
-  ${bold('Agent swarm:')}
-    swarm Scaffold the full agent swarm (skills, config, task definitions)
+    studio Interactive workspace bootstrap — detect project, select profile
   ${bold('Storage backends:')}
     firestore Set up Firestore backend (security rules + credentials guide)
@@ -1751,8 +1554,6 @@ function printInitHelp(): void {
     --template, -t Policy template name (creates agentguard.yaml)
     --name, -n Extension name (default: my-)
     --dir, -d Output directory (default: ./ or . for templates)
-    --tiers Comma-separated tiers for swarm (core,governance,ops,quality,marketing)
-    --force Overwrite existing skill files during swarm init
     --non-interactive Skip prompts, use detected defaults (for CI/scripting)
   ${bold('Examples:')}
@@ -1765,8 +1566,5 @@ function printInitHelp(): void {
     agentguard init firestore
     agentguard init studio
     agentguard init studio --non-interactive
-    agentguard init swarm
-    agentguard init swarm --tiers core,governance
-    agentguard init swarm --force
 `);
 }
diff --git a/apps/cli/tests/cli-init.test.ts b/apps/cli/tests/cli-init.test.ts
index 4625a933..f902f79f 100644
--- a/apps/cli/tests/cli-init.test.ts
+++ b/apps/cli/tests/cli-init.test.ts
@@ -10,19 +10,13 @@ vi.mock('node:fs', () => ({
   readFileSync: vi.fn(),
 }));
-vi.mock('@red-codes/swarm', () => ({
-  scaffold: vi.fn(),
-}));
-
 import { init } from '../src/commands/init.js';
-import { scaffold } from '@red-codes/swarm';
 beforeEach(() => {
   vi.clearAllMocks();
   vi.spyOn(console, 'log').mockImplementation(() => {});
   vi.spyOn(console, 'error').mockImplementation(() => {});
   vi.mocked(existsSync).mockReturnValue(false);
-  vi.mocked(scaffold).mockReturnValue({ agents: [{}, {}, {}], skillsWritten: 3, skillsSkipped: 0 });
 });
 afterEach(() => {
@@ -565,38 +559,6 @@ describe('init command', () => {
       expect(output).toContain('ci-safe');
     });
-    it('should scaffold swarm by default in non-interactive mode', async () => {
-      setupStudioMocks();
-      await init(['studio', '--non-interactive']);
-      expect(vi.mocked(scaffold)).toHaveBeenCalled();
-    });
-
-    it('should use minimal swarm preset for a non-monorepo project', async () => {
-      setupStudioMocks();
-      await init(['studio', '--non-interactive']);
-      const call = vi.mocked(scaffold).mock.calls[0];
-      expect(call?.[0]?.tiers).toEqual(['core']);
-    });
-
-    it('should use full swarm preset for a monorepo', async () => {
-      setupStudioMocks({ isMonorepo: true });
-      await init(['studio', '--non-interactive']);
-      const call = vi.mocked(scaffold).mock.calls[0];
-      expect(call?.[0]?.tiers).toEqual(
-        expect.arrayContaining(['core', 'governance', 'ops', 'quality', 'marketing'])
-      );
-    });
-
-    it('should handle swarm scaffold failure gracefully and still return 0', async () => {
-      setupStudioMocks();
-      vi.mocked(scaffold).mockImplementation(() => {
-        throw new Error('disk full');
-      });
-      const code = await init(['studio', '--non-interactive']);
-      expect(code).toBe(0);
-      expect(console.error).toHaveBeenCalledWith(expect.stringContaining('disk full'));
-    });
-
     it('should overwrite existing agentguard.yaml in non-interactive mode', async () => {
       setupStudioMocks({ hasExistingYaml: true });
       await init(['studio', '--non-interactive']);
@@ -626,15 +588,6 @@ describe('init command', () => {
       expect(output).toContain('Studio initialized');
     });
-    it('should include swarm tier in summary when swarm was scaffolded', async () => {
-      setupStudioMocks();
-      const consoleSpy = vi.spyOn(console, 'log').mockImplementation(() => {});
-      await init(['studio', '--non-interactive']);
-
-      const output = consoleSpy.mock.calls.flat().join('\n');
-      expect(output).toContain('minimal');
-    });
-
     it('should write files to custom --dir when provided', async () => {
       setupStudioMocks();
       await init(['studio', '--non-interactive', '--dir', '/tmp/studio-test-dir']);
diff --git a/docs/agent-sdlc-architecture.md b/docs/agent-sdlc-architecture.md
deleted file mode 100644
index 89aa5079..00000000
--- a/docs/agent-sdlc-architecture.md
+++ /dev/null
@@ -1,266 +0,0 @@
-# Deterministic Action Mediation for Agent-Native Software Engineering
-
-## Abstract
-
-As AI agents transition from passive code assistants to active system operators, the fundamental risk in software development shifts from incorrect reasoning to unsafe execution. Agent-generated actions may modify source code, execute infrastructure commands, or interact with external systems.
-
-Traditional guardrails embedded inside language models are probabilistic and insufficient for enforcing safe execution.
-
-This document proposes a deterministic architecture for Agent-Native Software Development Life Cycles (SDLC) based on three principles:
-
-1. Separation of reasoning and execution
-2. Deterministic authorization boundaries for agent actions
-3. Observable runtime telemetry for agent behavior
-
-The system introduces an Action Authorization Boundary (AAB) that mediates all agent actions and a runtime telemetry layer that records execution outcomes.
-
-Together these components create a controlled environment for safe agent-driven development.
-
-## System Architecture
-
-The architecture separates the AI reasoning layer from the execution environment.
-
-```
-Agent Reasoning Layer
-(LLM planning, code generation)
-        │
-        ▼
-Intent Compilation
-(structured action proposals)
-        │
-        ▼
-Action Authorization Boundary
-(deterministic policy enforcement)
-        │
-        ▼
-Execution Adapters
-(filesystem, shell, CI, APIs)
-        │
-        ▼
-Runtime Telemetry Layer
-(event monitoring, audit, and replay)
-```
-
-This separation ensures that probabilistic reasoning never directly controls real-world execution.
-
-## Core Components
-
-### 1. Intent Layer
-
-The AI agent produces structured intent objects representing requested actions rather than raw commands.
-
-Example:
-
-```json
-{
-  "action": "file.write",
-  "target": "src/auth/session.ts",
-  "justification": "Fix token refresh logic"
-}
-```
-
-Intent compilation converts natural language reasoning into canonical action representations. This normalization step is required for deterministic authorization.
- -**Implementation:** The canonical event factory in [`domain/events.js`](../domain/events.js) provides the `createEvent(kind, data)` function that normalizes all system activity into structured, validated events with stable fingerprints. - -### 2. Action Authorization Boundary (AAB) - -The AAB is the system's enforcement core and acts as a reference monitor for agent actions. - -Responsibilities: - -- Canonicalize action requests -- Evaluate policy and capability constraints -- Allow or deny execution -- Record authorization decisions -- Emit execution events - -Example policy: - -``` -file.write: src/** -test.run: allowed -shell.exec: restricted -terraform.apply: denied -``` - -The AAB represents the smallest trusted component in the system and must remain minimal and auditable. - -**Implementation:** The AAB is specified in [`docs/agentguard.md`](agentguard.md) with a complete evaluation pipeline (parse → scope → policy → invariant → blast radius → decision). The action interception prototype exists in [`core/cli/claude-hook.js`](../core/cli/claude-hook.js) as a Claude Code PostToolUse hook. - -### 3. Execution Adapters - -Execution adapters translate approved actions into real system operations. - -Examples: - -- Filesystem adapter -- Shell adapter -- Test runner adapter -- CI adapter -- API adapter - -All operations must pass through the AAB before execution. Direct agent access to execution environments is prohibited. - -**Implementation:** The CLI adapter ([`core/cli/adapter.js`](../core/cli/adapter.js)) wraps child processes and intercepts stderr. The ingestion pipeline ([`domain/ingestion/pipeline.js`](../domain/ingestion/pipeline.js)) orchestrates parse → fingerprint → classify → map stages for all intercepted output. - -### 4. Runtime Telemetry Layer - -The telemetry layer acts as the observability and feedback system for agent execution. 
- -Instead of relying solely on logs, it records structured events that describe both agent actions and their consequences. - -Example events: - -``` -ActionRequested -ActionAllowed -ActionDenied -FileModified -TestFailed -InvariantViolation -``` - -The telemetry layer enables: - -- Execution replay -- Debugging timelines -- Anomaly detection -- Developer feedback loops - -**Implementation:** The universal EventBus ([`domain/event-bus.js`](../domain/event-bus.js)) provides pub/sub across Node.js and browser environments. The EventStore ([`domain/event-store.js`](../domain/event-store.js)) persists events with query, replay, and filtering capabilities. The JSONL sink persists all events for audit trail and post-session analysis. - -## Event Model - -All system activity is captured as immutable events. - -Example flow: - -``` -Agent proposes action - │ - ▼ -AAB evaluates policy - │ - ▼ -Action allowed - │ - ▼ -Execution adapter performs operation - │ - ▼ -Test failure occurs - │ - ▼ -Telemetry records event -``` - -This event stream forms a complete audit trail of agent activity. - -**Implementation:** The system defines 30 canonical event kinds across 6 categories (ingestion, battle lifecycle, progression, session, governance, developer signals) in [`domain/events.js`](../domain/events.js). Events are validated against schemas, assigned monotonic IDs, and fingerprinted for deduplication. - -## Security and Reliability Model - -The architecture adopts principles from high-assurance systems. - -**Reference Monitor.** All access to execution resources must pass through the AAB. - -**Minimal Trusted Computing Base.** The authorization boundary remains small and verifiable. - -**Capability-Based Permissions.** Agents receive granular execution rights rather than broad system access. - -**Immutable Telemetry.** All decisions and outcomes are recorded as events. - -## Academic Foundations - -This architecture draws from three established fields of computer science. 
Citing these foundations distinguishes the system from ad-hoc AI tooling and positions it within rigorous engineering traditions. - -### 1. Reference Monitors (Anderson, 1972) - -The Action Authorization Boundary implements the classical reference monitor concept from James P. Anderson's 1972 Computer Security Technology Planning Study. A reference monitor must satisfy three properties: - -1. **Complete mediation** — every access to a protected resource is checked -2. **Tamper-proof** — the monitor cannot be bypassed or modified by the subjects it governs -3. **Verifiable** — the monitor is small enough to be subject to analysis and testing - -The AAB satisfies these properties by design. All agent actions must pass through the boundary (complete mediation). The AAB operates as a deterministic runtime separate from the AI reasoning layer (tamper-proof — the agent cannot modify its own constraints). The policy evaluation logic is pure functions operating on data with no inference or heuristics (verifiable). - -Most AI guardrail systems fail the reference monitor test because they embed safety checks inside the probabilistic model itself. By externalizing enforcement into a deterministic boundary, this architecture achieves the formal properties that embedded guardrails cannot. - -> Anderson, J. P. (1972). *Computer Security Technology Planning Study*. ESD-TR-73-51, Vol. II. Air Force Electronic Systems Division. - -### 2. Capability-Based Security (Dennis & Van Horn, 1966) - -The policy model follows the capability-based security paradigm introduced by Jack Dennis and Earl Van Horn. In this model, agents do not receive ambient authority (broad permissions inherited from the user's environment). Instead, each agent receives an explicit capability set — a bounded collection of permissions that defines exactly what actions it may perform. 
- -```yaml -# Agent capability set (not ambient authority) -policy: - scope: - include: ["src/**", "tests/**"] - exclude: ["src/database/**"] - permissions: - file_edit: allow - file_delete: deny - git_push: deny -``` - -This is the Principle of Least Authority (POLA) applied to AI agents. The agent can only act within its declared capabilities, regardless of the broader permissions available to the user who invoked it. - -The capability model also enables composition. Multiple policies can apply to the same agent. If any policy denies an action, the action is denied (fail-closed composition). - -> Dennis, J. B., & Van Horn, E. C. (1966). Programming semantics for multiprogrammed computations. *Communications of the ACM*, 9(3), 143–155. - -### 3. Event Sourcing (Domain-Driven Design) - -The canonical event model follows the Event Sourcing pattern from domain-driven design. Rather than storing only the current state of the system, every state change is captured as an immutable event. The current state can be reconstructed by replaying the event stream. - -This provides three critical capabilities for agent-driven development: - -1. **Audit trail** — every agent action, policy decision, and execution outcome is recorded with full context -2. **Replay** — any sequence of events can be replayed to reproduce system behavior, enabling debugging and root cause analysis -3. **Temporal queries** — the system can answer questions about what happened at any point in time, not just what the current state is - -The event store supports filtering by kind, time range, and fingerprint, enabling both real-time monitoring and post-hoc analysis. - -Event sourcing transforms agent observability from "what is the system doing now" to "what has the system done, and why." This distinction is essential for building trust in autonomous agent behavior. - -> Vernon, V. (2013). *Implementing Domain-Driven Design*. Addison-Wesley. -> Young, G. (2010). CQRS Documents. 
https://cqrs.files.wordpress.com/2010/11/cqrs_documents.pdf
-
-## Advantages
-
-- Deterministic safety enforcement
-- Observable agent execution
-- Improved debugging workflows
-- Human-interpretable agent behavior
-- Compatibility with existing development tools
-
-This architecture enables organizations to safely integrate autonomous agents into software engineering workflows without sacrificing control or auditability.
-
-## Future Work
-
-Potential extensions include:
-
-- Automated policy synthesis from codebase analysis
-- Invariant learning from repository history
-- Agent debugging replay systems
-- Multi-agent orchestration frameworks with shared governance
-- AI-assisted root cause analysis from event streams
-
-These capabilities transform agent-driven development from experimental tooling into a structured and observable engineering discipline.
-
-## Summary
-
-Agent-native development requires a fundamental shift in system architecture.
-
-By separating reasoning from execution and introducing deterministic enforcement boundaries, it becomes possible to safely deploy AI agents in real engineering environments.
-
-The combination of Action Authorization Boundaries (AAB) and runtime telemetry systems provides the foundational infrastructure for this next generation of software development.
- -## See Also - -- [AgentGuard Specification](agentguard.md) — detailed governance runtime design -- [Unified Architecture](unified-architecture.md) — full system architecture -- [Event Model](event-model.md) — canonical event schema and lifecycle -- [Architecture](../ARCHITECTURE.md) — system-level technical architecture diff --git a/docs/autonomous-sdlc-architecture.md b/docs/autonomous-sdlc-architecture.md deleted file mode 100644 index 56f833d0..00000000 --- a/docs/autonomous-sdlc-architecture.md +++ /dev/null @@ -1,1233 +0,0 @@ -# Autonomous SDLC Architecture - -> AgentGuard as a self-governing agent execution kernel: a capability-secured syscall runtime for autonomous software development. - -## 1. Architectural Thesis - -AgentGuard is a governed action runtime for AI coding agents. This document describes how it becomes a **self-governing autonomous SDLC testbed** — where AI agents develop AgentGuard itself, governed by AgentGuard's own runtime. - -### The Reflexive Property - -``` -AgentGuard - ↑ -developed by agents - ↑ -governed by AgentGuard itself -``` - -This is structurally identical to: -- Compilers compiling themselves -- Operating systems building themselves -- Kubernetes managing Kubernetes - -The system becomes a **live laboratory for agent safety**. Instead of theorizing about agent governance, we observe real agent behavior: failure modes, policy violations, unsafe tool usage, CI breakage patterns, drift between intent and execution. That produces empirical data, not theoretical models. - -### The OS Analogy - -In an operating system, programs cannot access hardware directly: - -``` -program → syscall → kernel → hardware -``` - -The kernel enforces permissions, memory safety, resource limits, and auditing. The same model applies to agents: - -``` -agent → syscall → AgentGuard kernel → system resources -``` - -Agents cannot directly access the filesystem, git, shell, or CI. Everything flows through AgentGuard's syscall interface. 
AgentGuard decides: **ALLOW**, **DENY**, or **REQUIRE_APPROVAL**. - -### Three-Layer Security Model - -Every syscall passes through three independent evaluation layers: - -``` -Layer 1: Capabilities → Can this agent even attempt this class of action? -Layer 2: Policies → Is this action allowed under current governance rules? -Layer 3: Invariants → Would this action violate system correctness constraints? -``` - -Each layer answers a different question. Each layer's decision is recorded separately in the audit trail. This separation prevents the system from collapsing into a single pile of allow/deny logic. - -**Default posture**: closed unless explicitly granted. No capability = no attempt possible. - -### Design Goals - -1. **Self-governing**: AgentGuard governs its own development. Every agent action on AgentGuard's codebase passes through AgentGuard's kernel. -2. **Syscall-mediated**: Agents interact with the system through 5 primitive operations. No direct access to filesystem, git, or shell. -3. **Capability-secured**: Agents possess specific, bounded, time-limited authority tokens. Default-deny, not default-allow. -4. **Minimal viable first**: 1 planner agent + 1 coder agent + governance runtime. No swarm until the narrow loop works. -5. **Experimentally grounded**: Every governance decision produces structured telemetry. Agent failure patterns become research data. - ---- - -## 2. 
Agent Syscall Interface - -### The 5 SDLC Primitives - -Everything agents do in a development lifecycle reduces to five operations: - -| Syscall | Purpose | Examples | -|---------|---------|---------| -| **`read_resource`** | Inspect system state | Read source files, view git diff, read issue descriptions, check test results | -| **`write_resource`** | Modify files | Write source code, edit tests, update configuration | -| **`run_task`** | Execute deterministic processes | Run tests, lint, build, type-check | -| **`create_artifact`** | Produce task outputs | Generate test results, coverage reports, lint reports | -| **`propose_change`** | Submit work for review | Create commits, open pull requests | - -Agents cannot perform anything outside this set. That gives deterministic governance over a small, auditable surface. - -### Mapping to AgentGuard's Action Types - -AgentGuard already defines 41 canonical action types across 10 classes (`packages/core/src/data/actions.json`). The 5 SDLC syscalls are a higher-level abstraction over these implementation-level types: - -| Syscall | AgentGuard Action Types | -|---------|------------------------| -| `read_resource` | `file.read`, `git.diff` | -| `write_resource` | `file.write`, `file.delete`, `file.move` | -| `run_task` | `test.run`, `test.run.unit`, `test.run.integration`, `npm.script.run` | -| `create_artifact` | `file.write` (to artifact output paths) | -| `propose_change` | `git.commit`, `git.branch.create` + external PR creation | - -The AAB (`src/kernel/aab.ts`) is the syscall router. 
It already normalizes Claude Code tool calls into canonical action types via `TOOL_ACTION_MAP`: - -``` -Claude Code Tool → AAB normalization → Action Type (syscall) -Write → normalizeIntent() → file.write (write_resource) -Edit → normalizeIntent() → file.write (write_resource) -Read → normalizeIntent() → file.read (read_resource) -Bash → detectGitAction() → git.* or shell.exec -Glob → normalizeIntent() → file.read (read_resource) -Grep → normalizeIntent() → file.read (read_resource) -``` - -For `Bash` tool calls, `detectGitAction()` further classifies git commands (e.g., `git push` → `git.push`, `git commit` → `git.commit`). - -### Syscall Wire Format - -Every syscall carries this structure (evolved from the Canonical Action Representation): - -```json -{ - "syscall": "write_resource", - "target": "src/kernel/monitor.ts", - "agent_id": "agent_dev_1a2b", - "capability_id": "cap_0192", - "payload": { - "content": "...", - "diff_lines": 42 - }, - "context": { - "task_id": "issue_42", - "role": "developer", - "run_id": "run_1709913400_abc", - "pipeline_stage": "implementation" - } -} -``` - -This maps to `RawAgentAction` (`src/kernel/aab.ts:17-27`) with context injected via `metadata`: - -```typescript -const raw: RawAgentAction = { - tool: 'Edit', - file: 'src/kernel/monitor.ts', - content: '...', - agent: 'agent_dev_1a2b', - metadata: { - role: 'developer', - taskId: 42, - capabilityId: 'cap_0192', - pipelineStage: 'implementation', - hook: 'PreToolUse', - }, -}; -``` - -### The Critical Rule - -**Agents must not be able to bypass the syscall interface.** - -This is enforced by registering AgentGuard as a **PreToolUse** hook for all Claude Code tools. The hook intercepts every tool call before execution and routes it through `kernel.propose()`. If the kernel denies the action, the tool call is blocked. - -Without PreToolUse enforcement, the capability model is advisory, not real. - ---- - -## 3. 
Capability Model - -### What a Capability Is - -A capability is a **signed grant of authority** to perform a bounded class of actions. Not role-based labels like "coder agent = can code." Instead, concrete, scoped, time-limited authority: - -```json -{ - "id": "cap_0192", - "subject": "agent_dev_1a2b", - "operation": "write_resource", - "scopes": [ - "repo://agent-guard/src/**", - "repo://agent-guard/tests/**" - ], - "constraints": { - "deny": [ - "repo://agent-guard/src/kernel/**", - "repo://agent-guard/src/policy/**", - "repo://agent-guard/src/invariants/**" - ], - "max_files": 20, - "max_diff_lines": 500 - }, - "issued_to": "agent_dev_1a2b", - "issued_by": "agentguard-scheduler", - "issued_at": "2026-03-09T10:00:00Z", - "expires_at": "2026-03-09T10:30:00Z", - "task_id": "issue_42" -} -``` - -This means: -- The agent **can** write files in `src/**` and `tests/**` -- The agent **cannot** write to `src/kernel/**`, `src/policy/**`, or `src/invariants/**` (self-modification protection) -- The agent **cannot** modify more than 20 files or 500 diff lines -- The authority **expires** after 30 minutes -- The authority is **scoped to a single task** - -### Why Capabilities, Not Just Policies - -Pure policy says: "any agent may ask, the system decides every time." The default is open-unless-denied. - -Capabilities say: "the agent can only even attempt actions for which it holds authority." The default is **closed-unless-explicitly-granted**. - -That is the correct default for autonomous systems. It changes the failure mode from "the system forgot to deny something" to "the system must explicitly grant everything." - -### The 5 Core Capabilities - -Each maps to one of the 5 syscalls: - -**A. Read Capability** -```json -{ - "operation": "read_resource", - "scopes": ["repo://agent-guard/src/**", "repo://agent-guard/docs/**", "artifact://test-results/**"] -} -``` - -**B. 
Write Capability** -```json -{ - "operation": "write_resource", - "scopes": ["repo://agent-guard/src/**", "repo://agent-guard/tests/**"], - "constraints": { - "deny": ["repo://agent-guard/src/kernel/**", "repo://agent-guard/src/policy/**"], - "max_files": 20 - } -} -``` - -**C. Task Capability** -```json -{ - "operation": "run_task", - "scopes": ["task://test", "task://lint", "task://build", "task://ts:check"] -} -``` - -**D. Artifact Capability** -```json -{ - "operation": "create_artifact", - "scopes": ["artifact://test-results/**", "artifact://coverage/**", "artifact://lint-report/**"] -} -``` - -**E. Change Proposal Capability** -```json -{ - "operation": "propose_change", - "scopes": ["branch://agent/issue-42/*"], - "constraints": { - "requires_artifacts": ["test-results", "coverage"], - "requires_tests_passing": true - } -} -``` - -### Capability Validation Flow - -For each syscall, the kernel validates in order: - -1. **Authentic?** — Is the token genuine (signature check)? -2. **Expired?** — Is the token still valid? -3. **Subject match?** — Does the token belong to this agent? -4. **Operation match?** — Does the token authorize this syscall type? -5. **Scope match?** — Does the target fall within the granted scopes? -6. **Constraints satisfied?** — Are constraint limits (max_files, deny patterns) met? - -Only after capability validation does the syscall proceed to Layer 2 (policies) and Layer 3 (invariants). - -### Roles as Capability Bundles - -Roles are an ergonomic layer. A "developer" role expands into a set of capability grants: - -``` -developer role → [ - read_resource(repo://**), - write_resource(repo://src/**, repo://tests/**), - run_task(task://test, task://lint, task://build), - create_artifact(artifact://test-results/**, artifact://coverage/**), - propose_change(branch://agent/*) -] -``` - -This keeps role-based thinking for humans while maintaining precise authority for governance. 
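The capability validation flow listed above could be sketched roughly as follows. This is an illustration under assumptions, not the kernel's code: `validateCapability`, `globToRegExp`, `SyscallRequest`, and the `filesTouched` counter are hypothetical, the glob matcher is deliberately simplified, and step 1 (signature verification) is left out because the section does not describe the signing scheme.

```typescript
// Hypothetical shapes; field names mirror the capability JSON shown earlier.
interface Capability {
  id: string;
  subject: string;
  operation: string;
  scopes: string[];
  constraints?: { deny?: string[]; max_files?: number };
  expires_at: string; // ISO 8601 timestamp
}

interface SyscallRequest {
  syscall: string;
  target: string;
  agent_id: string;
  filesTouched: number; // assumed running count of files modified in this task
}

type Verdict = { allowed: true } | { allowed: false; reason: string };

// Simplified glob: "**" matches anything, "*" matches within one path segment.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000")
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${pattern}$`);
}

const matchesAny = (patterns: string[], target: string): boolean =>
  patterns.some((p) => globToRegExp(p).test(target));

// Steps 2-6 of the flow, checked in order; the first failing check decides.
function validateCapability(cap: Capability, req: SyscallRequest, now: Date): Verdict {
  const deny = (reason: string): Verdict => ({ allowed: false, reason });

  if (now.getTime() >= Date.parse(cap.expires_at)) return deny("capability expired");
  if (cap.subject !== req.agent_id) return deny("subject mismatch");
  if (cap.operation !== req.syscall) return deny("operation mismatch");
  if (!matchesAny(cap.scopes, req.target)) return deny("target outside capability scope");
  if (matchesAny(cap.constraints?.deny ?? [], req.target)) {
    return deny("target matches a deny constraint");
  }
  const maxFiles = cap.constraints?.max_files;
  if (maxFiles !== undefined && req.filesTouched > maxFiles) {
    return deny("max_files exceeded");
  }
  return { allowed: true };
}
```

Returning a reason with every denial fits the audit requirement described elsewhere in the document, since each layer's decision is meant to be recorded separately.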
- -### Capability Lifecycle - -Capabilities are: -- **Short-lived**: Issued per task, expire when the task completes or times out -- **Task-scoped**: Each GitHub Issue gets a fresh authority envelope -- **Revocable**: The scheduler can revoke capabilities if escalation level rises -- **Non-transferable**: An agent cannot pass its capability to another agent (delegation requires the scheduler) - -### Delegation - -A planner agent does not directly issue capabilities. Instead, the scheduler observes the planner's output (file scope declarations, task assignments) and mints appropriately scoped capabilities for downstream agents: - -``` -Planner (via propose_change) → "Task: add rate limiting to monitor.ts" - → Scheduler reads planner output - → Scheduler mints capability for coder: - write_resource(src/kernel/monitor.ts, tests/ts/monitor.test.ts) - run_task(test, lint) - propose_change(branch://agent/issue-42) - → Coder receives bounded capability -``` - -This prevents agents from minting arbitrary authority. - ---- - -## 4. Three-Layer Security Model - -### Layer 1 — Capabilities: Authority - -**Question**: Can this agent even attempt this class of action? - -**Implementation**: Capability token validation (new — to be built). - -**Behavior**: Default-deny. If no valid capability exists for the requested syscall + scope, the action is immediately rejected before policy evaluation begins. - -**Example**: -``` -Agent: agent_qa_3c4d -Syscall: write_resource -Target: src/kernel/kernel.ts -Capability: write_resource(tests/**) - -Result: DENIED (target outside capability scope) -``` - -### Layer 2 — Policies: Governance - -**Question**: Is this action allowed under current governance rules? - -**Implementation**: Existing `PolicyRule` evaluation in `src/policy/evaluator.ts`. The `evaluate()` function matches actions against loaded policy rules, checking action patterns, scope conditions, branch conditions, and limits. - -**Behavior**: Rules can allow or deny. 
Deny rules from any source take priority (the evaluator checks denies first at `evaluator.ts:107`). - -**Example**: -``` -Agent: agent_dev_1a2b -Syscall: write_resource -Target: .github/workflows/ci.yml -Capability: write_resource(src/**, tests/**) — PASSES (different check) -Policy: deny file.write scope:[.github/**] — DENIED - -Result: DENIED (policy denial, reason: "CI config changes require human approval") -``` - -### Layer 3 — Invariants: Correctness - -**Question**: Would this action violate system correctness constraints? - -**Implementation**: Existing `DEFAULT_INVARIANTS` in `src/invariants/definitions.ts`. Six built-in invariants checked via `InvariantChecker`: - -1. **`no-secret-exposure`** (severity 5): No `.env`, credentials, `.pem`, `.key` files committed -2. **`protected-branch`** (severity 5): No direct push to main/master -3. **`blast-radius`** (severity 4): File modification count within limits -4. **`test-before-push`** (severity 3): Tests must pass before push -5. **`no-force-push`** (severity 5): No `git push --force` -6. **`lockfile-integrity`** (severity 3): Lock file consistency - -**Example**: -``` -Agent: agent_dev_1a2b -Syscall: propose_change -Target: branch://main -Capability: propose_change(branch://agent/*) — DENIED (scope mismatch) - -But even if capability passed: -Invariant: protected-branch — DENIED (direct push to main forbidden) -``` - -### Combined Evaluation Flow - -``` -Agent proposes tool call - ↓ -PreToolUse hook fires - ↓ -normalizeClaudeCodeAction() → RawAgentAction - ↓ -AAB.normalizeIntent() → NormalizedIntent (action type, target, destructive) - ↓ -┌─────────────────────────────────────────────────┐ -│ Layer 1: Capability Validation │ -│ Is capability token valid? │ -│ Does operation match? │ -│ Is target within scope? │ -│ Are constraints satisfied? 
│ -│ │ -│ → If DENIED: emit CapabilityDenied, return │ -└─────────────────────────┬───────────────────────┘ - ↓ -┌─────────────────────────────────────────────────┐ -│ Layer 2: Policy Evaluation │ -│ evaluate(intent, policies) → EvalResult │ -│ Match rules by action pattern, scope, branch │ -│ │ -│ → If DENIED: emit PolicyDenied, return │ -└─────────────────────────┬───────────────────────┘ - ↓ -┌─────────────────────────────────────────────────┐ -│ Layer 3: Invariant Checking │ -│ checkInvariants(systemState) → violations[] │ -│ Verify correctness constraints hold │ -│ │ -│ → If VIOLATED: emit InvariantViolation, return │ -└─────────────────────────┬───────────────────────┘ - ↓ -┌─────────────────────────────────────────────────┐ -│ Monitor: Escalation Check │ -│ Track denial rate, adjust escalation level │ -│ NORMAL → ELEVATED → HIGH → LOCKDOWN │ -│ LOCKDOWN = all actions denied │ -└─────────────────────────┬───────────────────────┘ - ↓ - EXECUTE action - ↓ - Emit lifecycle events to JSONL -``` - -### Audit Semantics - -Each layer's decision is recorded separately in the `GovernanceDecisionRecord`: - -```json -{ - "recordId": "dec_1709913600_a1b2", - "runId": "run_1709913400_abc", - "timestamp": 1709913600000, - "action": { - "type": "file.write", - "target": "src/kernel/monitor.ts", - "agent": "agent_dev_1a2b", - "destructive": false - }, - "capability": { - "id": "cap_0192", - "operation": "write_resource", - "scope_match": true, - "constraints_satisfied": true - }, - "policy": { - "matchedPolicyId": "sdlc-developer-policy", - "matchedPolicyName": "Developer Agent Policy", - "severity": 4, - "decision": "allow", - "reason": "Developer may modify source and tests" - }, - "invariants": { - "allHold": true, - "violations": [] - }, - "monitor": { - "escalationLevel": 0, - "totalEvaluations": 47, - "totalDenials": 2 - }, - "outcome": "allow", - "execution": { - "executed": true, - "success": true, - "durationMs": 12 - } -} -``` - -Now you can answer for any action: 
-- **What authority did the agent have?** → `capability` -- **Who issued it?** → `capability.issued_by` -- **Was it within scope?** → `capability.scope_match` -- **Which governance rule applied?** → `policy.matchedPolicyId` -- **Were correctness constraints maintained?** → `invariants.allHold` -- **What was the escalation level?** → `monitor.escalationLevel` - ---- - -## 5. System Topology - -``` -┌────────────────────────────────────────────────────────────────┐ -│ GitHub (Cloud) │ -│ │ -│ ┌──────────────┐ ┌───────────────┐ ┌─────────────────────┐ │ -│ │ Issues │ │ Pull Requests │ │ Actions Workflows │ │ -│ │ (Task Queue) │ │ (Agent Output) │ │ (Scheduler Trigger) │ │ -│ └──────┬───────┘ └───────▲───────┘ └──────────┬──────────┘ │ -└─────────┼──────────────────┼─────────────────────┼─────────────┘ - │ poll │ create PR │ trigger - ▼ │ ▼ -┌─────────────────────────────────────────────────────────────────┐ -│ External Scheduler │ -│ │ -│ ┌──────────┐ ┌──────────────┐ ┌──────────┐ ┌────────────┐ │ -│ │ Issue │ │ Capability │ │ Worktree │ │ PR │ │ -│ │ Poller │→ │ Minter │→ │ Manager │→ │ Creator │ │ -│ └──────────┘ └──────┬───────┘ └──────────┘ └────────────┘ │ -│ │ mint caps + spawn │ -└────────────────────────┼────────────────────────────────────────┘ - ▼ -┌─────────────────────────────────────────────────────────────────┐ -│ Agent Worktree (isolated git worktree) │ -│ │ -│ ┌──────────────────┐ │ -│ │ Claude Code Agent │ Every tool call is a syscall: │ -│ │ (capability-bound)│ │ -│ └────────┬─────────┘ │ -│ │ PreToolUse hook │ -│ ▼ │ -│ ┌──────────────────────────────────────────────────────────┐ │ -│ │ AgentGuard Kernel (syscall handler) │ │ -│ │ │ │ -│ │ RawAgentAction → AAB (syscall router) → NormalizedIntent │ │ -│ │ ↓ │ │ -│ │ Layer 1: Capability validation │ │ -│ │ ↓ │ │ -│ │ Layer 2: Policy evaluation │ │ -│ │ ↓ │ │ -│ │ Layer 3: Invariant checking │ │ -│ │ ↓ │ │ -│ │ Monitor (escalation) │ │ -│ │ ↓ │ │ -│ │ Execute or Deny → JSONL audit trail │ │ -│ 
└──────────────────────────────────────────────────────────┘ │ -└─────────────────────────────────────────────────────────────────┘ -``` - ---- - -## 6. Agent Roles as Capability Bundles - -### Role Definitions - -7 roles, each defined as a bundle of capabilities: - -| Role | Capability Bundle | -|------|------------------| -| **research** | `read_resource(repo://**)` | -| **product** | `read_resource(repo://**)` + `write_resource(docs/product/**, spec/**)` | -| **architect** | `read_resource(repo://**)` + `write_resource(docs/**, spec/**, *.md)` | -| **developer** | `read_resource(repo://**)` + `write_resource(src/**, tests/**, package.json)` + `run_task(test, lint, build, ts:check)` + `create_artifact(test-results, coverage)` + `propose_change(branch://agent/*)` | -| **qa** | `read_resource(repo://**)` + `write_resource(tests/**)` + `run_task(test, test:unit, test:integration, lint)` + `create_artifact(test-results, coverage, lint-report)` + `propose_change(branch://agent/*)` | -| **documentation** | `read_resource(repo://**)` + `write_resource(docs/**, *.md, examples/**)` + `propose_change(branch://agent/*)` | -| **auditor** | `read_resource(repo://**)` + `run_task(test, lint)` + `create_artifact(audit-report)` | - -### Self-Modification Protection - -When AgentGuard governs its own development, certain paths must be protected from agent modification: - -| Protected Path | Reason | -|---------------|--------| -| `src/kernel/**` | Core governance logic — agent modifications could weaken enforcement | -| `src/policy/**` | Policy engine — agents must not modify their own governance rules | -| `src/invariants/**` | Invariant definitions — agents must not weaken correctness constraints | -| `agentguard.yaml` | Default policy — changing this changes what agents can do | -| `.claude/settings.json` | Hook configuration — agents must not disable governance hooks | - -These paths are excluded from all write capabilities via `constraints.deny`. 
Only human commits can modify governance-critical code. - -### CLAUDE.md Role Templates - -Each agent receives a role-specific CLAUDE.md in its worktree: - -```markdown -# Agent Role: Developer -# Task: #{issue_number} — {issue_title} - -## Authority -You hold capabilities for: read_resource, write_resource, run_task, propose_change. -Your write scope is limited to: {allowed_paths} -Protected paths (will be denied): src/kernel/**, src/policy/**, src/invariants/** - -## Constraints -- Do NOT attempt to modify files outside your scope (the syscall will be denied) -- Do NOT push directly to any branch — commit to your worktree branch only -- Run `npm run ts:check` and `npm run ts:test` before committing -- Write tests for new functionality - -## Task Description -{issue_body} - -## Acceptance Criteria -{acceptance_criteria} -``` - ---- - -## 7. Task Lifecycle via GitHub Issues - -GitHub Issues serve as the task registry. State is encoded in labels. - -### Label Schema - -| Category | Labels | -|----------|--------| -| **Status** | `status:pending`, `status:assigned`, `status:in-progress`, `status:review`, `status:completed`, `status:failed` | -| **Type** | `task:implementation`, `task:test-generation`, `task:documentation`, `task:bug-fix`, `task:refactor`, `task:architecture`, `task:research`, `task:review` | -| **Priority** | `priority:critical`, `priority:high`, `priority:medium`, `priority:low` | -| **Role** | `role:developer`, `role:qa`, `role:architect`, `role:documentation`, `role:auditor` | -| **Retry** | `retry:0`, `retry:1`, `retry:2`, `retry:3` | -| **Governance** | `governance:clean`, `governance:violations`, `governance:lockdown` | - -### State Machine - -``` -┌─────────┐ ┌──────────┐ ┌─────────────┐ ┌──────────┐ -│ pending │──→│ assigned │──→│ in-progress │──→│ review │ -└─────────┘ └──────────┘ └──────┬──────┘ └────┬─────┘ - │ │ - ▼ ▼ - ┌──────────┐ ┌───────────┐ - │ failed │ │ completed │ - └────┬─────┘ └───────────┘ - │ - ▼ (if retries remain) - 
┌─────────┐ - │ pending │ - └─────────┘ -``` - -### Issue Body Template - -```markdown -## Task Description -[What needs to be done] - -## Acceptance Criteria -- [ ] Criterion 1 -- [ ] Criterion 2 - -## File Scope -Allowed paths for this task: -- `src/adapters/**` -- `tests/ts/adapter-*.test.ts` - -## Protected Paths -These paths must NOT be modified (enforced by capabilities): -- `src/kernel/**` -- `src/policy/**` -- `src/invariants/**` - -## Dependencies -Depends on: #41, #39 - -## Branch -`agent/implementation/issue-42` - -## Priority -high - -## Max Retries -3 -``` - -### Assignment Comment - -```markdown -**AgentGuard Scheduler** assigned this task. - -- **Agent**: `agent_dev_1a2b` -- **Role**: developer -- **Capabilities**: read_resource(repo://), write_resource(src/**, tests/**), run_task(test, lint, build), propose_change(branch://agent/issue-42/*) -- **Protected**: src/kernel/**, src/policy/**, src/invariants/** -- **Max Actions**: 200 -- **Timeout**: 30m -- **Run ID**: `run_1709913400_abc` -``` - -### Completion Comment - -```markdown -**AgentGuard Scheduler** — task completed. - -- **PR**: #87 -- **Actions**: 142 proposed, 138 allowed, 4 denied -- **Capability denials**: 1 (attempted write to src/kernel/) -- **Policy denials**: 2 (scope violations) -- **Invariant violations**: 1 (blast radius exceeded, resolved after split) -- **Escalation**: NORMAL -- **Duration**: 18m 32s - -
-<details><summary>Governance Summary</summary> - -| Layer | Evaluations | Denials | -|-------|------------|---------| -| Capabilities | 142 | 1 | -| Policies | 141 | 2 | -| Invariants | 139 | 1 | - -</details>
-``` - ---- - -## 8. Minimal Viable Architecture - -### Phase 1: One Planner + One Coder + Governance - -The minimal loop that produces useful experiments: - -``` -roadmap.md (human-written) - ↓ -planner agent (read_resource only) - ↓ -GitHub Issue (task definition + file scope) - ↓ -coder agent (capability-bound) - ↓ -every tool call → AgentGuard kernel - ↓ -policy checks + invariant checks - ↓ -allow / deny - ↓ -commit → PR → human review → merge - ↓ -CI -``` - -### What Already Exists in AgentGuard - -| Component | Status | Location | -|-----------|--------|----------| -| Action type normalization (AAB) | Exists | `src/kernel/aab.ts` | -| Policy evaluation | Exists | `src/policy/evaluator.ts` | -| 26 invariant checks | Exists | `packages/invariants/src/definitions.ts` | -| Escalation monitor (4 levels) | Exists | `src/kernel/monitor.ts` | -| JSONL event persistence | Exists | `src/events/jsonl.ts` | -| GovernanceDecisionRecord | Exists | `src/kernel/decisions/` | -| Pre-execution simulation | Exists | `src/kernel/simulation/` | -| Evidence pack generation | Exists | `src/kernel/evidence.ts` | -| Claude Code adapter | Exists | `src/adapters/claude-code.ts` | -| `claude-init` (hook setup) | Exists | `src/cli/commands/claude-init.ts` | -| `claude-hook` (PostToolUse/Bash) | Partial | `src/cli/commands/claude-hook.ts` | - -### What Needs to Be Built - -| Component | Priority | Description | -|-----------|----------|-------------| -| **PreToolUse hook for all tools** | P0 | Extend `claude-hook` to intercept all tools via PreToolUse, run `kernel.propose()`, block unauthorized actions. This is the syscall enforcement layer. | -| **Capability token schema** | P1 | JSON token format with id, subject, operation, scopes, constraints, expiry. Validated before policy evaluation. | -| **Capability validator in kernel** | P1 | New validation step in `kernel.propose()` between AAB normalization and policy evaluation. 
| -| **External scheduler** | P1 | Standalone process: poll GitHub Issues, mint capabilities, create worktrees, spawn claude CLI, monitor agents, create PRs. | -| **Role-to-capability mapping** | P2 | Expand role assignment into concrete capability tokens per task. | - -### Concrete End-to-End Flow - -**Setup (one-time)**: -```bash -# Install AgentGuard hook in Claude Code -npx aguard claude-init - -# Configure policies -cp policies/developer.yaml .agentguard/active-policy.yaml -``` - -**Per-task flow**: - -1. **Human creates GitHub Issue** with `agentguard-task` + `task:implementation` + `priority:high` labels -2. **Scheduler polls** and finds the pending issue -3. **Scheduler mints capabilities** based on issue's file scope section -4. **Scheduler creates worktree**: `git worktree add ../worktrees/issue-42 -b agent/implementation/issue-42` -5. **Scheduler writes CLAUDE.md** with role template + task description -6. **Scheduler writes capability token** to `.agentguard/capabilities/current.json` in the worktree -7. **Scheduler dispatches task**: via Octi Pulpo HTTP API → ShellForge → Anthropic API, in an isolated worktree -8. **Agent works** — every tool call triggers PreToolUse hook → AgentGuard kernel -9. **Kernel evaluates**: capability check → policy check → invariant check → execute or deny -10. **Agent completes** — commits to worktree branch -11. **Scheduler creates PR** via `gh pr create` -12. **Scheduler posts completion comment** on the issue with governance summary -13. **Human reviews PR** — merges or requests changes -14. **Scheduler cleans up** worktree - ---- - -## 9. 
Governance Integration - -### Hook Architecture - -The scheduler configures `.claude/settings.json` in each worktree: - -```json -{ - "hooks": { - "PreToolUse": [ - { - "matcher": "*", - "hooks": [ - { - "type": "command", - "command": "npx aguard claude-hook --mode=pre --run-id=run_abc" - } - ] - } - ], - "PostToolUse": [ - { - "matcher": "*", - "hooks": [ - { - "type": "command", - "command": "npx aguard claude-hook --mode=post --run-id=run_abc" - } - ] - } - ] - } -} -``` - -### PreToolUse Flow (Syscall Enforcement) - -``` -Claude Code proposes tool call - → PreToolUse hook fires - → claude-hook reads ClaudeCodeHookPayload from stdin - → normalizeClaudeCodeAction(payload) → RawAgentAction - → Load capability token from .agentguard/capabilities/current.json - → Inject capability + role + taskId into RawAgentAction.metadata - → kernel.propose(rawAction) → KernelResult - → If denied: write denial message to stdout (Claude Code shows it, skips tool call) - → If allowed: exit 0 silently (tool call proceeds) - → Decision persisted to .agentguard/events/.jsonl -``` - -### PostToolUse Flow (Telemetry) - -``` -Claude Code completes tool call - → PostToolUse hook fires - → Log execution result to JSONL - → Increment action counter in .agentguard/state/counter.json - → If action count exceeds max: write "Action limit reached" to stdout -``` - -### Environment Variables - -| Variable | Example | Purpose | -|----------|---------|---------| -| `AGENTGUARD_ROLE` | `developer` | Agent's role | -| `AGENTGUARD_TASK_ID` | `42` | GitHub Issue number | -| `AGENTGUARD_RUN_ID` | `run_1709913400_abc` | Kernel run ID | -| `AGENTGUARD_POLICY` | `policies/developer.yaml` | Policy file | -| `AGENTGUARD_MAX_ACTIONS` | `200` | Action limit | -| `AGENTGUARD_CAP_FILE` | `.agentguard/capabilities/current.json` | Capability token path | - -### Escalation Integration - -The monitor (`src/kernel/monitor.ts`) tracks escalation: - -| Level | Value | Behavior | -|-------|-------|----------| -| NORMAL 
| 0 | All clear | -| ELEVATED | 1 | Elevated denial rate, continue with caution | -| HIGH | 2 | Significant violations — scheduler pauses | -| LOCKDOWN | 3 | All actions denied | - -When JSONL events show escalation reaching HIGH, the scheduler: -1. Terminates all active agent processes -2. Posts comments on all in-progress issues -3. Updates issue labels to `governance:lockdown` -4. Stops polling until human intervention - ---- - -## 10. Execution Loop - -### Minimal Scheduler Pseudocode - -``` -INITIALIZE: - config = loadSchedulerConfig() - github = createGitHubClient(config.github) - -MAIN LOOP: - while scheduler.running: - - // 1. POLL - issues = github.listIssues({ - labels: ["agentguard-task", "status:pending"], - sort: "created", direction: "asc" - }) - .filter(i => allDependenciesMet(i)) - .sort(byPriority) - - if issues.length === 0 || activeAgents >= config.maxConcurrent: - sleep(config.pollIntervalMs) - continue - - issue = issues[0] - - // 2. DETERMINE ROLE - taskType = extractLabel(issue, "task:") // assumed to return the label value without the "task:" prefix, e.g. "implementation" - role = mapTaskTypeToRole(taskType) - // task:implementation → developer - // task:test-generation → qa - // task:architecture → architect - - // 3. MINT CAPABILITIES - filePaths = parseFileScope(issue.body) - capabilities = mintCapabilityBundle(role, { - taskId: issue.number, - scopes: filePaths, - protectedPaths: SELF_MODIFICATION_DENY_LIST, - expiresIn: config.taskTimeoutMs, - }) - - // 4. CREATE WORKTREE - runId = newRunId() // hypothetical helper; yields an ID such as run_1709913400_abc - branch = `agent/${taskType}/issue-${issue.number}` - worktreePath = `../worktrees/issue-${issue.number}` - exec(`git worktree add ${worktreePath} -b ${branch}`) - - // 5. PREPARE WORKTREE - writeFile(`${worktreePath}/CLAUDE.md`, renderRoleTemplate(role, issue)) - writeFile(`${worktreePath}/.agentguard/capabilities/current.json`, capabilities) - writeFile(`${worktreePath}/.claude/settings.json`, renderHookSettings(runId)) - copyFile(`policies/${role}.yaml`, `${worktreePath}/.agentguard/policy.yaml`) - - // 6. 
UPDATE ISSUE - github.updateLabels(issue.number, { add: ["status:assigned", `role:${role}`] }) - github.postComment(issue.number, renderAssignmentComment({...})) - - // 7. SPAWN AGENT - agentProcess = spawn("claude", ["--print", "-p", renderTaskPrompt(issue)], { - cwd: worktreePath, - env: { AGENTGUARD_ROLE: role, AGENTGUARD_TASK_ID: issue.number, ... }, - timeout: config.taskTimeoutMs, - }) - - github.updateLabels(issue.number, { add: ["status:in-progress"] }) - - // 8. ON COMPLETION - agentProcess.on("exit", async (code) => { - summary = parseJsonlEvents(`${worktreePath}/.agentguard/events/${runId}.jsonl`) - - if code === 0 && summary.escalationLevel < HIGH: - exec(`git push origin ${branch}`, { cwd: worktreePath }) - pr = github.createPR({ - title: `[${role}] ${issue.title}`, - body: renderPRBody(issue, summary, capabilities), - head: branch, base: "main", - }) - github.updateLabels(issue.number, { add: ["status:review"] }) - github.postComment(issue.number, renderCompletionComment(pr, summary)) - else: - retries = extractRetryCount(issue) - if retries < config.maxRetries: - github.updateLabels(issue.number, { - add: ["status:pending", `retry:${retries + 1}`], - remove: ["status:in-progress"] - }) - else: - github.updateLabels(issue.number, { add: ["status:failed"] }) - github.postComment(issue.number, renderFailureComment(summary)) - - exec(`git worktree remove ${worktreePath} --force`) - }) -``` - ---- - -## 11. Policy Configuration - -Role-scoped YAML policies using AgentGuard's existing `PolicyRule` format. 
- -### Developer Policy (`policies/developer.yaml`) - -```yaml -id: sdlc-developer-policy -name: Developer Agent Policy -description: Governs developer agents — allows src/ and tests/ modifications -severity: 4 - -rules: - - action: file.write - effect: allow - conditions: - scope: ["src/**", "tests/**", "package.json"] - reason: Developer may modify source, tests, and package.json - - - action: file.write - effect: deny - conditions: - scope: [".github/**", "Dockerfile", ".env*", "agentguard.yaml"] - reason: CI, environment, and governance config require human approval - - - action: [file.write, file.delete] - effect: deny - conditions: - scope: ["src/kernel/**", "src/policy/**", "src/invariants/**"] - reason: Self-modification protection — governance code is human-only - - - action: [git.push, deploy.trigger, infra.apply, infra.destroy, npm.publish] - effect: deny - reason: Production-affecting actions require human authorization - - - action: file.read - effect: allow - reason: Reading is always safe - - - action: [shell.exec, git.commit, git.branch.create, npm.install] - effect: allow - reason: Development operations within worktree are safe -``` - -### QA Policy (`policies/qa.yaml`) - -```yaml -id: sdlc-qa-policy -name: QA Agent Policy -severity: 4 - -rules: - - action: file.write - effect: allow - conditions: - scope: ["tests/**", "**/*.test.ts", "**/*.test.js", "**/*.spec.ts"] - reason: QA writes test files - - - action: file.write - effect: deny - conditions: - scope: ["src/**"] - reason: QA agents do not modify production code - - - action: [test.run, test.run.unit, test.run.integration] - effect: allow - reason: Test execution is QA's primary function - - - action: [git.push, deploy.trigger, npm.publish] - effect: deny - reason: QA cannot push or deploy - - - action: [file.read, shell.exec, git.commit] - effect: allow - reason: Basic development operations -``` - -### Auditor Policy (`policies/auditor.yaml`) - -```yaml -id: sdlc-auditor-policy -name: 
Auditor Agent Policy -severity: 5 - -rules: - - action: [file.write, file.delete, file.move, git.commit, git.push, npm.install, deploy.trigger] - effect: deny - reason: Auditor is strictly read-only - - - action: [file.read, test.run, test.run.unit, test.run.integration] - effect: allow - reason: Auditor reads code and verifies tests -``` - ---- - -## 12. Telemetry & Observability - -### What Already Exists - -AgentGuard already logs every action as structured events. The `GovernanceDecisionRecord` (`src/kernel/decisions/types.ts`) captures: - -- `action.type` — the syscall (e.g., `file.write`) -- `action.target` — the resource (e.g., `src/kernel/monitor.ts`) -- `action.agent` — the agent ID -- `outcome` — `allow` or `deny` -- `reason` — human-readable explanation -- `policy.matchedPolicyId` — which policy rule applied -- `invariants.violations[]` — which correctness checks failed -- `monitor.escalationLevel` — current escalation state -- `execution.success` — whether the action succeeded - -Events are persisted to `.agentguard/events/.jsonl`, one JSON per line. 
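
Because each line is a standalone JSON document, a run can be summarized in one pass over the file. The following is a minimal sketch, not AgentGuard's implementation: `DecisionLine` is a hypothetical subset of the record (only `outcome`, `action.type`, and `monitor.escalationLevel` from the list above), and `summarizeDecisions` is an illustrative name.

```typescript
// Hypothetical subset of a decision record; field names follow the list above.
interface DecisionLine {
  outcome: "allow" | "deny";
  action: { type: string };
  monitor?: { escalationLevel?: number };
}

interface RunSummary {
  allowed: number;
  denied: number;
  maxEscalation: number;
  deniedByType: Record<string, number>;
}

// Summarize JSONL text: one JSON document per non-empty line.
function summarizeDecisions(jsonl: string): RunSummary {
  const summary: RunSummary = {
    allowed: 0,
    denied: 0,
    maxEscalation: 0,
    deniedByType: {},
  };
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank lines and a trailing newline
    const record = JSON.parse(line) as DecisionLine;
    if (record.outcome === "allow") {
      summary.allowed += 1;
    } else {
      summary.denied += 1;
      const type = record.action.type;
      summary.deniedByType[type] = (summary.deniedByType[type] ?? 0) + 1;
    }
    // Track the highest escalation level seen; a missing level counts as 0.
    summary.maxEscalation = Math.max(
      summary.maxEscalation,
      record.monitor?.escalationLevel ?? 0,
    );
  }
  return summary;
}
```

A scheduler could compare `maxEscalation` against its pause threshold before opening a PR, and `deniedByType` gives the "most-denied action types" signal without a database.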
- -### Enhanced Telemetry for Autonomous SDLC - -The autonomous loop adds capability context to each event (via `metadata`): - -```json -{ - "id": "evt_1709913600_42", - "kind": "ActionDenied", - "timestamp": 1709913600000, - "fingerprint": "a1b2c3d4", - "actionType": "file.write", - "target": "src/kernel/kernel.ts", - "reason": "Self-modification protection — governance code is human-only", - "metadata": { - "agent_id": "agent_dev_1a2b", - "role": "developer", - "task_id": 42, - "capability_id": "cap_0192", - "capability_scope_match": false, - "layer": "capability", - "run_id": "run_1709913400_abc" - } -} -``` - -### The Dogfooding Dividend - -Running this on AgentGuard's own codebase produces empirical data on: - -| Signal | What It Tells You | -|--------|------------------| -| Most-denied action types | Which operations agents attempt unsafely | -| Capability scope violations | Where agents exceed their authority | -| Policy denial patterns | Which governance rules produce the most friction | -| Invariant violation frequency | Which correctness constraints agents hit | -| Escalation triggers | What behavior patterns cause systemic denial rates | -| Self-modification attempts | How often agents try to weaken their own governance | -| Code quality drift | Whether agent-written code maintains standards | - -This telemetry is the raw material for: -- Policy hardening (every violation → policy improvement) -- Agent safety research (real failure modes, not theoretical ones) -- Product evidence (AgentGuard's own development proves its governance model works) - ---- - -## 13. Recommended AgentGuard Enhancements - -Prioritized changes to AgentGuard's codebase to support the autonomous SDLC: - -### Priority 0 — Extended `claude-hook` (Syscall Enforcement Layer) - -**File**: `src/cli/commands/claude-hook.ts` - -Currently: PostToolUse only, Bash only, reports errors. -Needed: PreToolUse for all tools, full `kernel.propose()` evaluation, deny messages to stdout. 
- -This is the **critical enabler**. Without PreToolUse enforcement, agents can execute unauthorized actions before governance evaluates them. - -**Changes**: -- Accept `--mode=pre|post` flag -- Accept `--policy=` and `--run-id=` flags -- In `pre` mode: load policy, load capability token, run `kernel.propose()`, write deny to stdout if blocked -- In `post` mode: log execution result, update action counter -- Process all tool types (not just Bash) - -### Priority 1 — Capability Token Validation - -**New file**: `src/kernel/capability.ts` - -```typescript -interface CapabilityToken { - id: string; - subject: string; - operation: string; - scopes: string[]; - constraints?: { - deny?: string[]; - max_files?: number; - max_diff_lines?: number; - requires_artifacts?: string[]; - }; - issued_to: string; - issued_by: string; - issued_at: string; - expires_at: string; - task_id: string; -} - -function validateCapability(token: CapabilityToken, intent: NormalizedIntent): CapabilityResult; -``` - -**Modify**: `src/kernel/kernel.ts` — add capability validation before `monitor.process()` in `propose()`. 
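
To make the intended check concrete, here is a minimal sketch of what the validation step might do. It is an illustration under stated assumptions, not the planned implementation: `TokenLike`, `IntentLike`, and `CapabilityResultLike` are simplified stand-ins for `CapabilityToken`, `NormalizedIntent`, and `CapabilityResult`; the glob matcher supports only `*` and `**`; and evaluating the deny list before the granted scopes is an assumed ordering.

```typescript
// Simplified stand-ins; the real types may carry more fields.
interface TokenLike {
  operation: string;
  scopes: string[];
  constraints?: { deny?: string[] };
  expires_at: string; // ISO 8601 timestamp
}
interface IntentLike {
  operation: string;
  target: string;
}
interface CapabilityResultLike {
  allowed: boolean;
  reason: string;
}

// Translate a glob into a RegExp: "**" crosses "/" boundaries, "*" does not.
function globToRegExp(glob: string): RegExp {
  // Escape every regex metacharacter except "*", which is handled below.
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, "\\$&");
  const pattern = escaped
    .split("**")
    .map((part) => part.replace(/\*/g, "[^/]*"))
    .join(".*");
  return new RegExp(`^${pattern}$`);
}

function validateCapabilitySketch(
  token: TokenLike,
  intent: IntentLike,
  now: Date = new Date(),
): CapabilityResultLike {
  const expiry = Date.parse(token.expires_at);
  // Fail closed on an unparseable or past expiry.
  if (Number.isNaN(expiry) || expiry <= now.getTime()) {
    return { allowed: false, reason: "capability expired" };
  }
  if (token.operation !== intent.operation) {
    return { allowed: false, reason: "operation not granted" };
  }
  const matches = (globs: string[]): boolean =>
    globs.some((glob) => globToRegExp(glob).test(intent.target));
  // Assumed ordering: an explicit deny wins over any granted scope.
  if (matches(token.constraints?.deny ?? [])) {
    return { allowed: false, reason: "target is on the deny list" };
  }
  if (!matches(token.scopes)) {
    return { allowed: false, reason: "target outside granted scopes" };
  }
  return { allowed: true, reason: "within capability scope" };
}
```

With a token scoped to `src/**` that denies `src/kernel/**`, a write to `src/adapters/file.ts` would pass while a write to `src/kernel/kernel.ts` would be rejected at the capability layer, before policy evaluation runs.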
- -### Priority 2 — Role/Task as First-Class Fields - -**Modify**: `src/kernel/aab.ts` — add `role?: string` and `taskId?: string` to `RawAgentAction` -**Modify**: `src/policy/evaluator.ts` — add `role?: string` and `taskId?: string` to `NormalizedIntent` - -### Priority 3 — Role-Based Policy Conditions - -**Modify**: `src/policy/evaluator.ts` — add `roles?: string[]` and `ownership?: string[]` to `PolicyRule.conditions` - -### Priority 4 — New Invariants - -**Modify**: `src/invariants/definitions.ts`: -- `architectural-boundary`: Files must be within role's owned paths -- `self-modification-guard`: Governance-critical paths cannot be modified by agents -- `build-must-succeed`: Build passing before commit/push - -### Priority 5 — New Event Kinds - -**Modify**: `src/core/types.ts` — add: `CapabilityDenied`, `CapabilityExpired`, `TaskAssigned`, `TaskCompleted`, `AgentRegistered` - ---- - -## 14. Open Questions & Future Work - -### Open Questions - -1. **PreToolUse blocking semantics**: Does Claude Code's PreToolUse hook reliably block tool execution when stdout contains a denial message? This needs empirical verification for each tool type. - -2. **Capability token signing**: For the minimal viable version, local JSON files are sufficient. For multi-machine deployments, capability tokens need cryptographic signing. What signing scheme? (HMAC is simplest, JWT is most portable.) - -3. **Self-modification boundary**: Which files constitute "governance-critical" code? The current list (`src/kernel/**`, `src/policy/**`, `src/invariants/**`) may be too broad or too narrow. - -4. **Dependency resolution**: When task A depends on task B's unmerged PR, should task A work against task B's branch? Or wait for merge? - -5. **Cost management**: Each agent invocation consumes API tokens. Should the scheduler enforce a per-task or per-day token budget? 
- -### Future Work - -- **Capability delegation**: Planner agents issue scoped capabilities to downstream agents (requires trust model for the planner). -- **Capability revocation**: Real-time revocation when escalation level rises (currently capabilities have fixed expiry). -- **Parallel pipelines**: Independent tasks in separate worktrees, running concurrently. -- **Agent memory**: Persist learnings across tasks (common denial reasons, preferred patterns). -- **Automated review**: Auditor agent reviews PRs before human review. -- **Conflict detection**: Detect overlapping file modifications across concurrent worktrees. -- **Policy learning**: Analyze denial telemetry to suggest policy rule adjustments. -- **Safety benchmarks**: Publish agent failure telemetry as a reproducible safety benchmark. - -### The Long-Term Vision - -If the reflexive model works — AgentGuard governing its own development — the system demonstrates a concrete answer to the question: "How do you let AI agents do real work while maintaining deterministic control?" - -The answer is not "trust the model" or "add guardrails." The answer is: - -``` -capability-secured syscall interface - → deterministic policy evaluation - → correctness invariant enforcement - → structured audit trail -``` - -That is the architectural primitive. Everything else is scaffolding. diff --git a/docs/autonomous-sdlc-methodology.md b/docs/autonomous-sdlc-methodology.md deleted file mode 100644 index 19d9151a..00000000 --- a/docs/autonomous-sdlc-methodology.md +++ /dev/null @@ -1,283 +0,0 @@ -# Autonomous SDLC Methodology: How 70K Lines Were Built in Under Two Weeks - -## Executive Summary - -AgentGuard — a 70,000-line, 20-package governed action runtime — was built in under two weeks by a single engineer orchestrating an autonomous agent swarm. This document describes the methodology, provides evidence of the velocity achieved, and explains why this approach is transferable to any software domain. 
- -**Key metrics:** -- 33,668 lines of production TypeScript, 39,160 lines of tests -- 20 workspace packages, 3 applications (CLI, VS Code extension, telemetry server) -- 142 test files, 26 invariants, 47 event kinds, 41 action types -- Equivalent to 240 story points / 6-8 months of traditional senior engineering -- Delivered in <2 weeks by one person with an autonomous agent swarm - ---- - -## The Core Insight - -Governance is not a constraint on velocity. It is the **enabler** of velocity. - -When agents cannot accidentally break production, delete secrets, or corrupt the repository, they can run continuously without human supervision. The entire SDLC — implementation, review, testing, merging, documentation — becomes autonomous. - ---- - -## 1. Swarm Architecture: 26 Agents Across 5 Tiers - -The swarm is organized into tiers with distinct responsibilities and cadences: - -### Core Tier (8 agents, every 2 hours) -| Agent | Role | -|-------|------| -| coder-agent | Implements issues on feature branches | -| code-review-agent | Reviews open PRs for correctness, style, and safety | -| pr-merger-agent | Auto-merges approved PRs with passing CI | -| ci-triage-agent | Triages CI failures (skip-if-green) | -| merge-conflict-resolver | Resolves merge conflicts (serialized, 1 PR/run) | -| pr-review-responder | Responds to unresolved review comments | -| stale-branch-janitor | Daily cleanup of stale branches and PRs | -| recovery-controller | Self-healing for swarm health | - -### Governance Tier (3 agents) -| Agent | Role | -|-------|------| -| governance-monitor | Daily governance audit + policy effectiveness | -| risk-escalation-agent | Cumulative risk assessment, gates dangerous operations | -| recovery-controller | Detects unhealthy conditions, executes remediation | - -### Operations Tier (8 agents, daily) -| Agent | Role | -|-------|------| -| planning-agent | Sprint planning + strategy ingestion | -| observability-agent | SRE health analysis | -| backlog-steward | 
ROADMAP expansion (capped 3 issues/run) | -| docs-sync-agent | Documentation staleness detection | -| product-agent | Product health + roadmap alignment | -| progress-controller | Tracks roadmap phase transitions | -| repo-hygiene-agent | Detects stale/solved issues | -| retrospective-agent | Weekly failure pattern analysis | - -### Quality Tier (5 agents) -| Agent | Role | -|-------|------| -| test-agent | Daily test health + coverage analysis | -| test-generation-agent | Weekly test generation for untested modules | -| security-audit-agent | Weekly dependency + source code security scan | -| architect-agent | Reviews open PRs for architectural concerns | -| cicd-hardening-agent | Weekly CI/CD supply chain audit | - -### Marketing Tier (1 agent) -| Agent | Role | -|-------|------| -| marketing-content-agent | Weekly content generation | - ---- - -## 2. Skill Composition: 39 Reusable Building Blocks - -Each agent composes its workflow from reusable skills. A skill is a markdown template that defines a discrete capability with clear inputs, outputs, and STOP conditions. - -**Development Skills:** start-governance-runtime, sync-main, discover-next-issue, claim-issue, implement-issue, run-tests, create-pr - -**Governance Skills:** governance-log-audit, policy-effectiveness-review, recovery-controller, risk-escalation - -**Quality Skills:** full-test, test-health-review, generate-tests, dependency-security-audit, security-code-scan, architecture-review, cicd-hardening-audit - -**Operational Skills:** sprint-planning, observability-review, backlog-steward, scheduled-docs-sync, product-health-review, progress-controller, repo-hygiene, retrospective, resolve-merge-conflicts, respond-to-pr-reviews, stale-branch-janitor, triage-failing-ci - -**Artifact Skills:** release-prepare, release-publish, marketing-content - -### Example: The Coder Agent Workflow - -``` -1. start-governance-runtime → activate PreToolUse hooks -2. sync-main → pull latest from origin -3. 
discover-next-issue → find highest-priority pending issue -4. claim-issue → label with status:in-progress -5. implement-issue → code on feature branch -6. run-tests → pnpm test, fix failures -7. create-pr → push + open PR with evidence summary - -If any skill reports STOP, end the run and report why. -``` - -Each agent's prompt includes a critical autonomy directive: - -> "This is an unattended scheduled task. No human is present. NEVER pause to ask for clarification. Default to the safest option." - -### Claude Desktop Configuration - -Scaffolding generates the files, but each agent must also be configured in **Claude Desktop** to run autonomously: - -1. **Create scheduled tasks** — Register each agent as a scheduled task in Claude Desktop using the cron schedules from the agent manifest. Point the task at the agent's prompt file (e.g., `.claude/prompts/coder-agent.md`). - -2. **Set `worktree: true`** — Enable worktree isolation for every scheduled task. Each agent run gets its own isolated git worktree, preventing parallel agents from conflicting on file writes, git index locks, and branch state. Without this, concurrent agents will corrupt each other's working directory. - -3. **Bypass permissions manually** — Since agents run unattended with no human to approve tool-use prompts, you must pre-approve the required permissions (file read/write, shell execution, git operations) for each agent in Claude Desktop. AgentGuard's governance policy and invariant system provide a second layer of defense, but Claude Desktop's permission gates must be resolved before the agent can start. - ---- - -## 3. Governance as Execution Boundary - -Every agent tool call passes through the AgentGuard kernel before execution: - -``` -Agent proposes tool call (Write, Bash, etc.) 
- ↓ -PreToolUse hook fires → AgentGuard kernel - ↓ -AAB normalizes intent (tool → action type, detect destructive patterns) - ↓ -Three-layer evaluation: - Layer 1: Capabilities — can this agent attempt this action class? - Layer 2: Policies — is this allowed under governance rules? - Layer 3: Invariants — would this violate system correctness? - ↓ -Decision recorded (JSONL/SQLite audit trail) - ↓ -ALLOW → tool executes | DENY → tool blocked, agent informed -``` - -### 17 Built-in Invariants - -1. no-secret-exposure (severity 5) — block .env, .pem, .key, credentials -2. protected-branch (severity 5) — deny direct push to main/master -3. blast-radius (severity 4) — file modification count limits -4. test-before-push (severity 3) — tests must pass before push -5. no-force-push (severity 5) — block git push --force -6. lockfile-integrity (severity 3) — lock file consistency -7. no-skill-modification (severity 5) — block agent skill self-modification -8. no-scheduled-task-modification (severity 5) — block cron/schedule changes -9. credential-file-creation (severity 5) — block new .netrc, AWS credentials -10. package-script-injection (severity 4) — protect lifecycle scripts -11. recursive-operation-guard (severity 2) — catch find -exec + write/delete -12. large-file-write (severity 3) — per-file size limits -13. no-cicd-config-modification (severity 5) — protect .github/workflows -14. no-permission-escalation (severity 4) — prevent chmod/chown abuse -15. no-governance-self-modification (severity 5) — protect governance config -16. no-container-config-modification (severity 3) — protect Dockerfile -17. 
no-env-var-modification (severity 3) — prevent shell profile injection - -### Escalation State Machine - -Repeated violations trigger escalation that constrains agent behavior: - -``` -NORMAL → ELEVATED → HIGH → LOCKDOWN -``` - -- **NORMAL**: all issues eligible -- **ELEVATED**: prefer smaller-scope issues -- **HIGH**: only 5-file-or-fewer issues -- **LOCKDOWN**: refuse new work entirely - ---- - -## 4. Distributed Coordination via Shared State - -Agents coordinate without a centralized scheduler through a shared state contract: - -```json -{ - "mode": "normal", - "prQueueHealthy": true, - "openAgentPRs": 3, - "currentPhase": "Phase 5", - "escalationLevel": "NORMAL", - "lastSync": "2026-03-15T10:43:47Z", - "priorities": ["issue-336", "issue-335"] -} -``` - -**Pre-flight checks prevent queue saturation:** -- If `openAgentPRs >= 5`, the coder agent skips its run -- If `escalationLevel === "LOCKDOWN"`, no new work is started -- If `mode === "SAFE"`, only fixes are attempted - -This enables parallel execution across 26 agents without coordination overhead. - ---- - -## 5. The Reflexive Property: Self-Governing Development - -AgentGuard governs the agents that develop AgentGuard. This creates: - -1. **Empirical behavior data** — every agent failure, policy violation, and unsafe pattern becomes research data that improves the product -2. **Live safety lab** — the development process is itself a governance testbed -3. **Recursive validation** — if AgentGuard governs its own development and the development succeeds, the governance model is validated - -This is not a toy demo. The 70K-line codebase, passing test suite, and functioning CI/CD pipeline are proof that the governance model works under real conditions. - ---- - -## 6. 
Velocity Evidence - -### Commit Timeline (3 days of recorded history) - -| Date | Commits | Highlights | -|------|---------|------------| -| Mar 13 | 10 | Permission escalation invariant, SQLite migrations, SQL aggregation, policy suggestions, governance self-modification invariant | -| Mar 14 | 25 | Webhook storage, monorepo restructure, telemetry server, swarm package, privacy-first telemetry, performance benchmarks, plan-level simulation | -| Mar 15 | 16 | Session viewer, tamper-resistant audit trail, OpenClaw adapter, dependency graph simulator, auto-build hooks, agent persona capture | - -### Parallel Execution Evidence - -Bursts of 3 commits within 2 minutes are visible throughout the log — a signature of multiple agents completing work simultaneously on independent branches: - -``` -2026-03-14 20:30:25 | feat(issue-395): add performance benchmark suite -2026-03-14 20:30:58 | fix(issue-401): fix Windows path separators -2026-03-14 20:31:21 | feat(issue-416): add transitive effect analysis invariant -``` - -PR numbers reach #449, indicating hundreds of PRs were opened, reviewed, and merged autonomously. - -### Engineering Equivalent - -| Metric | Value | -|--------|-------| -| Story points (estimated) | ~240 SP | -| Traditional solo engineer time | 6-8 months | -| Traditional 2-person team time | 3-4 months | -| Actual time with agent swarm | <2 weeks | -| Velocity multiplier | 10-15x | -| Dollar equivalent (agency rates) | $190-250K | - ---- - -## 7. Why This Is Transferable - -This methodology is not specific to AgentGuard. The pattern generalizes: - -1. **Define a governance kernel** — deterministic policy evaluation for agent actions -2. **Create reusable skills** — template library of domain-specific capabilities -3. **Personalize agents** — distinct prompts with role-specific autonomy directives -4. **Hook into execution** — intercept tool calls before execution -5. **Audit everything** — canonical event model captures all decisions -6. 
**Coordinate via state** — shared JSON enables distributed orchestration -7. **Self-govern development** — agents developing the system are governed by the system - -Any software project can adopt this pattern. The governance kernel prevents the failure modes that make autonomous agents dangerous (secret exposure, destructive commands, unreviewed merges), while the skill composition system enables rapid domain-specific customization. - ---- - -## Academic Foundations - -The architecture draws from three established computer science fields: - -- **Reference Monitors** (Anderson, 1972) — every action checked against policy -- **Capability-Based Security** (Dennis & Van Horn, 1966) — bounded, explicit authority -- **Event Sourcing** (Domain-Driven Design) — all state changes as immutable events - ---- - -## Conclusion - -The autonomous SDLC methodology demonstrated here achieves 10-15x velocity over traditional engineering by: - -1. Running 26 specialized agents in parallel on 2-hour cycles -2. Enforcing governance boundaries that make unsupervised execution safe -3. Composing workflows from 39 reusable skills -4. Coordinating through shared state rather than human standup meetings -5. Auditing every decision for accountability and learning - -The result is not just a product — it is proof that governed autonomous agents can reliably produce production-grade software at a pace that was previously impossible. 
diff --git a/docs/roadmap-overview.md b/docs/roadmap-overview.md index d7e11be3..104be022 100644 --- a/docs/roadmap-overview.md +++ b/docs/roadmap-overview.md @@ -39,7 +39,7 @@ AgentGuard is the **mandatory execution control plane for AI agents** — the ru | MCP governance server (15 tools) | Implemented | Production | | Plugin ecosystem (discovery, registry, sandboxing) | Implemented | Production | | 8 policy packs (essentials, strict, ci-safe, enterprise, open-source, soc2, hipaa, eng-standards) | Implemented | Production | -| 26-agent autonomous swarm templates | Implemented | Production | +| Multi-agent governance templates | Implemented | Production | | KE-1 Structured matchers (Aho-Corasick, globs, reason codes) | **Shipped v2.3.0** | `packages/matchers/` | | All 47 event kinds mapped to cloud AgentEvent | **Shipped v2.3.0** | `packages/telemetry/src/event-mapper.ts` | | Agent SDK for programmatic governance | **Shipped v2.3.0** | Programmatic governance integration | diff --git a/docs/superpowers/plans/2026-03-25-squad-swarm-implementation.md b/docs/superpowers/plans/2026-03-25-squad-swarm-implementation.md deleted file mode 100644 index 9aec2319..00000000 --- a/docs/superpowers/plans/2026-03-25-squad-swarm-implementation.md +++ /dev/null @@ -1,1125 +0,0 @@ -# Squad Swarm Structure Implementation Plan - -> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. - -**Goal:** Restructure the flat 26-agent swarm into 3 product squads (Kernel, Cloud, QA) with EM→Director→Human reporting, Copilot CLI as workhorse, 5-layer loop guards, and squad identity flowing through telemetry to the dashboard. - -**Architecture:** Extend `@red-codes/swarm` types with squad hierarchy (`SquadManifest`, `Squad`, `SquadAgent`). Each squad writes its own state file. Loop guards are checked by every agent at run start. 
Identity format `driver:model:squad:rank` is parsed from existing `agent_id` fields — no schema migration needed. - -**Tech Stack:** TypeScript (swarm package), YAML (squad manifest), JSON (squad state), existing `@red-codes/swarm` + `@red-codes/core` packages - -**Spec:** `docs/superpowers/specs/2026-03-25-squad-swarm-structure-design.md` - ---- - -### Task 1: Extend swarm types with squad hierarchy - -**Files:** -- Modify: `packages/swarm/src/types.ts` -- Create: `packages/swarm/tests/types.test.ts` - -- [ ] **Step 1: Write the failing test** - -```typescript -// packages/swarm/tests/types.test.ts -import { describe, it, expect } from 'vitest'; -import type { - SquadManifest, - Squad, - SquadAgent, - SquadRank, - SquadState, - LoopGuardConfig, -} from '../src/types.js'; - -describe('Squad types', () => { - it('SquadAgent has driver, model, squad, rank fields', () => { - const agent: SquadAgent = { - id: 'kernel-senior', - rank: 'senior', - driver: 'copilot-cli', - model: 'sonnet', - cron: '0 */2 * * *', - skills: ['claim-issue', 'implement-issue', 'create-pr'], - }; - expect(agent.driver).toBe('copilot-cli'); - expect(agent.rank).toBe('senior'); - }); - - it('Squad contains em + 5 agents', () => { - const squad: Squad = { - name: 'kernel', - repo: 'agent-guard', - em: { - id: 'kernel-em', - rank: 'em', - driver: 'claude-code', - model: 'opus', - cron: '0 */3 * * *', - skills: ['squad-plan', 'squad-execute'], - }, - agents: { - 'product-lead': { id: 'kernel-pl', rank: 'product-lead', driver: 'claude-code', model: 'sonnet', cron: '0 6 * * *', skills: [] }, - architect: { id: 'kernel-arch', rank: 'architect', driver: 'claude-code', model: 'opus', cron: '0 */4 * * *', skills: [] }, - senior: { id: 'kernel-sr', rank: 'senior', driver: 'copilot-cli', model: 'sonnet', cron: '0 */2 * * *', skills: [] }, - junior: { id: 'kernel-jr', rank: 'junior', driver: 'copilot-cli', model: 'copilot', cron: '0 */2 * * *', skills: [] }, - qa: { id: 'kernel-qa', rank: 'qa', driver: 
'copilot-cli', model: 'sonnet', cron: '0 */3 * * *', skills: [] }, - }, - }; - expect(Object.keys(squad.agents)).toHaveLength(5); - expect(squad.em.rank).toBe('em'); - }); - - it('SquadManifest has director + squads', () => { - const manifest: SquadManifest = { - version: '1.0.0', - org: { - director: { id: 'director', rank: 'director', driver: 'claude-code', model: 'opus', cron: '0 7,19 * * *', skills: [] }, - }, - squads: {}, - loopGuards: { - maxOpenPRsPerSquad: 3, - maxRetries: 3, - maxBlastRadius: 20, - maxRunMinutes: 10, - }, - }; - expect(manifest.org.director.rank).toBe('director'); - }); - - it('SquadRank includes all valid ranks', () => { - const ranks: SquadRank[] = ['director', 'em', 'product-lead', 'architect', 'senior', 'junior', 'qa']; - expect(ranks).toHaveLength(7); - }); -}); -``` - -- [ ] **Step 2: Run test to verify it fails** - -Run: `pnpm vitest run packages/swarm/tests/types.test.ts` -Expected: FAIL — types not exported - -- [ ] **Step 3: Add squad types to types.ts** - -Add to `packages/swarm/src/types.ts`: - -```typescript -// --- Squad hierarchy types --- - -export type SquadRank = 'director' | 'em' | 'product-lead' | 'architect' | 'senior' | 'junior' | 'qa'; -export type AgentDriver = 'claude-code' | 'copilot-cli'; -export type AgentModel = 'opus' | 'sonnet' | 'haiku' | 'copilot'; - -export interface SquadAgent { - readonly id: string; - readonly rank: SquadRank; - readonly driver: AgentDriver; - readonly model: AgentModel; - readonly cron: string; - readonly skills: readonly string[]; -} - -export interface Squad { - readonly name: string; - readonly repo: string; // repo name or '*' for cross-repo - readonly em: SquadAgent; - readonly agents: Readonly<Record<string, SquadAgent>>; -} - -export interface SquadManifest { - readonly version: string; - readonly org: { - readonly director: SquadAgent; - }; - readonly squads: Readonly<Record<string, Squad>>; - readonly loopGuards: LoopGuardConfig; -} - -export interface LoopGuardConfig { - readonly maxOpenPRsPerSquad: number; - readonly 
maxRetries: number; - readonly maxBlastRadius: number; - readonly maxRunMinutes: number; -} - -export interface SquadState { - readonly squad: string; - readonly sprint: { - readonly goal: string; - readonly issues: readonly string[]; - }; - readonly assignments: Readonly>; - readonly blockers: readonly string[]; - readonly prQueue: { - readonly open: number; - readonly reviewed: number; - readonly mergeable: number; - }; - readonly updatedAt: string; -} - -export interface EMReport { - readonly squad: string; - readonly timestamp: string; - readonly health: 'green' | 'yellow' | 'red'; - readonly summary: string; - readonly blockers: readonly string[]; - readonly escalations: readonly string[]; - readonly metrics: { - readonly prsOpened: number; - readonly prsMerged: number; - readonly issuesClosed: number; - readonly denials: number; - readonly retries: number; - }; -} - -export interface DirectorBrief { - readonly timestamp: string; - readonly squads: Readonly>; - readonly escalationsForHuman: readonly string[]; - readonly overallHealth: 'green' | 'yellow' | 'red'; -} -``` - -Export from `packages/swarm/src/index.ts`. 
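To illustrate how the `EMReport` and `DirectorBrief` shapes above might fit together, here is a minimal sketch that rolls per-squad health into a single `overallHealth` value. The "worst status wins" rule and the `rollupHealth` helper are assumptions made for illustration; the plan does not say how the director derives overall health.

```typescript
// Hypothetical helper, not part of the plan. Assumes "worst status wins".
type Health = 'green' | 'yellow' | 'red';

// Higher number means worse health.
const severity: Record<Health, number> = { green: 0, yellow: 1, red: 2 };

function rollupHealth(squadHealth: readonly Health[]): Health {
  // With no reports at all, default to green (an assumption, not a documented rule).
  return squadHealth.reduce<Health>(
    (worst, h) => (severity[h] > severity[worst] ? h : worst),
    'green',
  );
}

// One yellow squad is enough to make the overall brief yellow.
console.log(rollupHealth(['green', 'yellow', 'green']));
```

A rule like this keeps a single struggling squad visible in the director brief instead of averaging it away.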
- -- [ ] **Step 4: Run tests** - -Run: `pnpm vitest run packages/swarm/tests/types.test.ts` -Expected: PASS - -- [ ] **Step 5: Commit** - -```bash -git add packages/swarm/src/types.ts packages/swarm/src/index.ts packages/swarm/tests/types.test.ts -git commit -m "feat(swarm): add squad hierarchy types — SquadManifest, Squad, SquadAgent, LoopGuardConfig" -``` - ---- - -### Task 2: Squad manifest YAML loader - -**Files:** -- Create: `packages/swarm/src/squad-manifest.ts` -- Create: `packages/swarm/tests/squad-manifest.test.ts` -- Create: `packages/swarm/templates/config/squad-manifest.default.yaml` - -- [ ] **Step 1: Create the default manifest YAML** - -```yaml -# packages/swarm/templates/config/squad-manifest.default.yaml -version: "1.0.0" - -org: - director: - id: director - rank: director - driver: claude-code - model: opus - cron: "0 7,19 * * *" - skills: [squad-status, director-brief, escalation-router] - -squads: - kernel: - repo: agent-guard - em: - id: kernel-em - rank: em - driver: claude-code - model: opus - cron: "0 */3 * * *" - skills: [squad-plan, squad-execute, squad-status, squad-retro, escalation-router] - agents: - product-lead: - id: kernel-pl - rank: product-lead - driver: claude-code - model: sonnet - cron: "0 6 * * *" - skills: [sprint-planning, roadmap-expand, backlog-steward, learn] - architect: - id: kernel-arch - rank: architect - driver: claude-code - model: opus - cron: "0 */4 * * *" - skills: [architecture-review, review-open-prs, eval, evolve] - senior: - id: kernel-sr - rank: senior - driver: copilot-cli - model: sonnet - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, create-pr, run-tests] - junior: - id: kernel-jr - rank: junior - driver: copilot-cli - model: copilot - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, run-tests, generate-tests] - qa: - id: kernel-qa - rank: qa - driver: copilot-cli - model: sonnet - cron: "0 */3 * * *" - skills: [e2e-testing, compliance-test, test-health-review, learn, prune] - 
- cloud: - repo: agentguard-cloud - em: - id: cloud-em - rank: em - driver: claude-code - model: opus - cron: "0 */3 * * *" - skills: [squad-plan, squad-execute, squad-status, squad-retro, escalation-router] - agents: - product-lead: - id: cloud-pl - rank: product-lead - driver: claude-code - model: sonnet - cron: "0 6 * * *" - skills: [sprint-planning, roadmap-expand, backlog-steward, learn] - architect: - id: cloud-arch - rank: architect - driver: claude-code - model: opus - cron: "0 */4 * * *" - skills: [architecture-review, review-open-prs, eval, evolve] - senior: - id: cloud-sr - rank: senior - driver: copilot-cli - model: sonnet - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, create-pr, run-tests] - junior: - id: cloud-jr - rank: junior - driver: copilot-cli - model: copilot - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, run-tests, generate-tests] - qa: - id: cloud-qa - rank: qa - driver: copilot-cli - model: sonnet - cron: "0 */3 * * *" - skills: [e2e-testing, compliance-test, test-health-review, learn, prune] - - qa: - repo: "*" - em: - id: qa-em - rank: em - driver: claude-code - model: sonnet - cron: "0 */3 * * *" - skills: [squad-plan, squad-execute, squad-status, squad-retro, escalation-router] - agents: - product-lead: - id: qa-pl - rank: product-lead - driver: claude-code - model: sonnet - cron: "0 6 * * *" - skills: [sprint-planning, test-strategy, stranger-test-plan, learn] - architect: - id: qa-arch - rank: architect - driver: claude-code - model: sonnet - cron: "0 */4 * * *" - skills: [test-architecture, compliance-review, eval, evolve] - senior: - id: qa-sr - rank: senior - driver: copilot-cli - model: sonnet - cron: "0 */2 * * *" - skills: [playwright-e2e, stranger-test-run, compliance-test, create-pr] - junior: - id: qa-jr - rank: junior - driver: copilot-cli - model: copilot - cron: "0 */2 * * *" - skills: [generate-tests, run-tests, test-data-generation] - qa: - id: qa-qa - rank: qa - driver: copilot-cli - 
model: haiku - cron: "0 */1 * * *" - skills: [e2e-testing, regression-analysis, flakiness-detection, learn, prune] - -loopGuards: - maxOpenPRsPerSquad: 3 - maxRetries: 3 - maxBlastRadius: 20 - maxRunMinutes: 10 -``` - -- [ ] **Step 2: Write the failing test** - -```typescript -// packages/swarm/tests/squad-manifest.test.ts -import { describe, it, expect } from 'vitest'; -import { loadSquadManifest } from '../src/squad-manifest.js'; -import { readFileSync } from 'fs'; -import { join } from 'path'; - -describe('loadSquadManifest', () => { - it('loads the default manifest', () => { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - const manifest = loadSquadManifest(yaml); - expect(manifest.version).toBe('1.0.0'); - expect(manifest.org.director.rank).toBe('director'); - expect(manifest.org.director.driver).toBe('claude-code'); - }); - - it('parses all 3 squads', () => { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - const manifest = loadSquadManifest(yaml); - expect(Object.keys(manifest.squads)).toEqual(['kernel', 'cloud', 'qa']); - }); - - it('each squad has em + 5 agents', () => { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - const manifest = loadSquadManifest(yaml); - for (const [name, squad] of Object.entries(manifest.squads)) { - expect(squad.em.rank).toBe('em'); - expect(Object.keys(squad.agents)).toHaveLength(5); - } - }); - - it('builds agent identity strings', () => { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - const manifest = loadSquadManifest(yaml); - const sr = manifest.squads.kernel.agents.senior; - const identity = `${sr.driver}:${sr.model}:kernel:${sr.rank}`; - expect(identity).toBe('copilot-cli:sonnet:kernel:senior'); - }); - - it('parses loop 
guard config', () => { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - const manifest = loadSquadManifest(yaml); - expect(manifest.loopGuards.maxOpenPRsPerSquad).toBe(3); - expect(manifest.loopGuards.maxRetries).toBe(3); - expect(manifest.loopGuards.maxBlastRadius).toBe(20); - expect(manifest.loopGuards.maxRunMinutes).toBe(10); - }); -}); -``` - -- [ ] **Step 3: Run test to verify it fails** - -Run: `pnpm vitest run packages/swarm/tests/squad-manifest.test.ts` -Expected: FAIL — `loadSquadManifest` not found - -- [ ] **Step 4: Implement squad manifest loader** - -```typescript -// packages/swarm/src/squad-manifest.ts -import { parse } from 'yaml'; -import type { SquadManifest, Squad, SquadAgent, LoopGuardConfig } from './types.js'; - -export function loadSquadManifest(yamlContent: string): SquadManifest { - const raw = parse(yamlContent) as Record<string, unknown>; - - const org = raw.org as Record<string, unknown>; - const director = parseAgent(org.director as Record<string, unknown>); - - const rawSquads = raw.squads as Record<string, Record<string, unknown>>; - const squads: Record<string, Squad> = {}; - - for (const [name, rawSquad] of Object.entries(rawSquads)) { - const em = parseAgent(rawSquad.em as Record<string, unknown>); - const rawAgents = rawSquad.agents as Record<string, Record<string, unknown>>; - const agents: Record<string, SquadAgent> = {}; - for (const [role, rawAgent] of Object.entries(rawAgents)) { - agents[role] = parseAgent(rawAgent); - } - squads[name] = { - name, - repo: rawSquad.repo as string, - em, - agents, - }; - } - - const rawGuards = raw.loopGuards as Record<string, number>; - const loopGuards: LoopGuardConfig = { - maxOpenPRsPerSquad: rawGuards.maxOpenPRsPerSquad ?? 3, - maxRetries: rawGuards.maxRetries ?? 3, - maxBlastRadius: rawGuards.maxBlastRadius ?? 20, - maxRunMinutes: rawGuards.maxRunMinutes ?? 
10, - }; - - return { - version: raw.version as string, - org: { director }, - squads, - loopGuards, - }; -} - -function parseAgent(raw: Record<string, unknown>): SquadAgent { - return { - id: raw.id as string, - rank: raw.rank as SquadAgent['rank'], - driver: raw.driver as SquadAgent['driver'], - model: raw.model as SquadAgent['model'], - cron: raw.cron as string, - skills: (raw.skills as string[]) ?? [], - }; -} - -/** Build the 4-part identity string: driver:model:squad:rank */ -export function buildAgentIdentity(agent: SquadAgent, squadName: string): string { - return `${agent.driver}:${agent.model}:${squadName}:${agent.rank}`; -} - -/** Parse a 4-part identity string back into components */ -export function parseAgentIdentity(identity: string): { - driver: string; - model: string; - squad: string; - rank: string; -} | null { - const parts = identity.split(':'); - if (parts.length < 4) return null; - return { - driver: parts[0], - model: parts[1], - squad: parts[2], - rank: parts[3], - }; -} -``` - -Add `yaml` dependency: `pnpm add yaml --filter=@red-codes/swarm` - -Export from `packages/swarm/src/index.ts`. 
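The identity helpers above can be exercised with a simple round trip. The sketch below restates them locally so it runs on its own, and it tightens the parser to accept exactly four non-empty parts. That stricter check, along with the names `buildIdentity` and `parseIdentityStrict`, is an assumption for illustration; the plan's parser only requires at least four parts.

```typescript
// Local restatement of the plan's identity format: driver:model:squad:rank.
interface AgentIdentity {
  driver: string;
  model: string;
  squad: string;
  rank: string;
}

function buildIdentity(id: AgentIdentity): string {
  return `${id.driver}:${id.model}:${id.squad}:${id.rank}`;
}

// Assumption: exactly four non-empty parts (stricter than the plan's `< 4` check).
function parseIdentityStrict(identity: string): AgentIdentity | null {
  const parts = identity.split(':');
  if (parts.length !== 4 || parts.some((p) => p.length === 0)) return null;
  const [driver, model, squad, rank] = parts;
  return { driver, model, squad, rank };
}

const roundTrip = parseIdentityStrict(
  buildIdentity({ driver: 'copilot-cli', model: 'sonnet', squad: 'kernel', rank: 'senior' }),
);
console.log(roundTrip);
console.log(parseIdentityStrict('copilot-cli:sonnet:kernel')); // null (too few parts)
```

Rejecting extra or empty segments may be worth considering if identities are parsed out of free-form `agent_id` fields, where a stray colon would otherwise be silently ignored.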
- -- [ ] **Step 5: Run tests** - -Run: `pnpm vitest run packages/swarm/tests/squad-manifest.test.ts` -Expected: PASS - -- [ ] **Step 6: Commit** - -```bash -git add packages/swarm/ -git commit -m "feat(swarm): squad manifest YAML loader with identity builder/parser" -``` - ---- - -### Task 3: Squad state reader/writer - -**Files:** -- Create: `packages/swarm/src/squad-state.ts` -- Create: `packages/swarm/tests/squad-state.test.ts` - -- [ ] **Step 1: Write the failing test** - -```typescript -// packages/swarm/tests/squad-state.test.ts -import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { readSquadState, writeSquadState, readEMReport, writeEMReport, readDirectorBrief, writeDirectorBrief } from '../src/squad-state.js'; -import { mkdtempSync, rmSync, mkdirSync } from 'fs'; -import { join } from 'path'; -import { tmpdir } from 'os'; - -describe('squad state', () => { - let dir: string; - - beforeEach(() => { - dir = mkdtempSync(join(tmpdir(), 'squad-')); - mkdirSync(join(dir, '.agentguard', 'squads', 'kernel'), { recursive: true }); - }); - - afterEach(() => { - rmSync(dir, { recursive: true, force: true }); - }); - - it('writes and reads squad state', () => { - const state = { - squad: 'kernel', - sprint: { goal: 'Go kernel Phase 2', issues: ['#860'] }, - assignments: { - senior: { current: '#860', status: 'implementing' }, - }, - blockers: [], - prQueue: { open: 1, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - writeSquadState(dir, 'kernel', state); - const read = readSquadState(dir, 'kernel'); - expect(read?.squad).toBe('kernel'); - expect(read?.sprint.goal).toBe('Go kernel Phase 2'); - }); - - it('returns null for missing state', () => { - const read = readSquadState(dir, 'nonexistent'); - expect(read).toBeNull(); - }); - - it('writes and reads EM report', () => { - const report = { - squad: 'kernel', - timestamp: new Date().toISOString(), - health: 'green' as const, - summary: 'All clear', - blockers: [], - 
escalations: [], - metrics: { prsOpened: 2, prsMerged: 1, issuesClosed: 3, denials: 0, retries: 0 }, - }; - writeEMReport(dir, 'kernel', report); - const read = readEMReport(dir, 'kernel'); - expect(read?.health).toBe('green'); - }); - - it('writes and reads director brief', () => { - const brief = { - timestamp: new Date().toISOString(), - squads: {}, - escalationsForHuman: ['Need decision on Go vs Rust for hot path'], - overallHealth: 'yellow' as const, - }; - writeDirectorBrief(dir, brief); - const read = readDirectorBrief(dir); - expect(read?.overallHealth).toBe('yellow'); - expect(read?.escalationsForHuman).toHaveLength(1); - }); -}); -``` - -- [ ] **Step 2: Run test to verify it fails** - -Run: `pnpm vitest run packages/swarm/tests/squad-state.test.ts` -Expected: FAIL — functions not found - -- [ ] **Step 3: Implement state reader/writer** - -```typescript -// packages/swarm/src/squad-state.ts -import { readFileSync, writeFileSync, existsSync, mkdirSync } from 'node:fs'; -import { join } from 'node:path'; -import type { SquadState, EMReport, DirectorBrief } from './types.js'; - -function squadDir(root: string, squad: string): string { - return join(root, '.agentguard', 'squads', squad); -} - -function ensureDir(dir: string): void { - if (!existsSync(dir)) mkdirSync(dir, { recursive: true }); -} - -export function readSquadState(root: string, squad: string): SquadState | null { - const path = join(squadDir(root, squad), 'state.json'); - if (!existsSync(path)) return null; - try { - return JSON.parse(readFileSync(path, 'utf8')) as SquadState; - } catch { - return null; - } -} - -export function writeSquadState(root: string, squad: string, state: SquadState): void { - const dir = squadDir(root, squad); - ensureDir(dir); - writeFileSync(join(dir, 'state.json'), JSON.stringify(state, null, 2), 'utf8'); -} - -export function readEMReport(root: string, squad: string): EMReport | null { - const path = join(squadDir(root, squad), 'em-report.json'); - if 
(!existsSync(path)) return null; - try { - return JSON.parse(readFileSync(path, 'utf8')) as EMReport; - } catch { - return null; - } -} - -export function writeEMReport(root: string, squad: string, report: EMReport): void { - const dir = squadDir(root, squad); - ensureDir(dir); - writeFileSync(join(dir, 'em-report.json'), JSON.stringify(report, null, 2), 'utf8'); -} - -export function readDirectorBrief(root: string): DirectorBrief | null { - const path = join(root, '.agentguard', 'director-brief.json'); - if (!existsSync(path)) return null; - try { - return JSON.parse(readFileSync(path, 'utf8')) as DirectorBrief; - } catch { - return null; - } -} - -export function writeDirectorBrief(root: string, brief: DirectorBrief): void { - const dir = join(root, '.agentguard'); - ensureDir(dir); - writeFileSync(join(dir, 'director-brief.json'), JSON.stringify(brief, null, 2), 'utf8'); -} -``` - -Export from `packages/swarm/src/index.ts`. - -- [ ] **Step 4: Run tests** - -Run: `pnpm vitest run packages/swarm/tests/squad-state.test.ts` -Expected: PASS - -- [ ] **Step 5: Commit** - -```bash -git add packages/swarm/ -git commit -m "feat(swarm): squad state reader/writer for state, EM reports, director briefs" -``` - ---- - -### Task 4: Loop guards - -**Files:** -- Create: `packages/swarm/src/loop-guards.ts` -- Create: `packages/swarm/tests/loop-guards.test.ts` - -- [ ] **Step 1: Write the failing test** - -```typescript -// packages/swarm/tests/loop-guards.test.ts -import { describe, it, expect } from 'vitest'; -import { checkLoopGuards } from '../src/loop-guards.js'; -import type { LoopGuardConfig, SquadState } from '../src/types.js'; - -const defaultGuards: LoopGuardConfig = { - maxOpenPRsPerSquad: 3, - maxRetries: 3, - maxBlastRadius: 20, - maxRunMinutes: 10, -}; - -describe('loop guards', () => { - it('passes when all guards clear', () => { - const state: SquadState = { - squad: 'kernel', - sprint: { goal: 'test', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { 
open: 1, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - const result = checkLoopGuards(defaultGuards, state, { - retryCount: 0, - predictedFileChanges: 5, - runStartTime: Date.now(), - }); - expect(result.allowed).toBe(true); - expect(result.violations).toHaveLength(0); - }); - - it('fails budget guard when too many PRs open', () => { - const state: SquadState = { - squad: 'kernel', - sprint: { goal: 'test', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { open: 4, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - const result = checkLoopGuards(defaultGuards, state, { - retryCount: 0, - predictedFileChanges: 5, - runStartTime: Date.now(), - }); - expect(result.allowed).toBe(false); - expect(result.violations).toContain('budget'); - }); - - it('fails retry guard after 3 retries', () => { - const state: SquadState = { - squad: 'kernel', - sprint: { goal: 'test', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { open: 0, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - const result = checkLoopGuards(defaultGuards, state, { - retryCount: 4, - predictedFileChanges: 5, - runStartTime: Date.now(), - }); - expect(result.allowed).toBe(false); - expect(result.violations).toContain('retry'); - }); - - it('fails blast radius guard when too many files', () => { - const state: SquadState = { - squad: 'kernel', - sprint: { goal: 'test', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { open: 0, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - const result = checkLoopGuards(defaultGuards, state, { - retryCount: 0, - predictedFileChanges: 25, - runStartTime: Date.now(), - }); - expect(result.allowed).toBe(false); - expect(result.violations).toContain('blast-radius'); - }); - - it('fails time guard when run exceeds limit', () => { - const state: SquadState = { - squad: 'kernel', - sprint: { goal: 'test', issues: [] }, - assignments: {}, - 
blockers: [], - prQueue: { open: 0, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - const result = checkLoopGuards(defaultGuards, state, { - retryCount: 0, - predictedFileChanges: 5, - runStartTime: Date.now() - 11 * 60 * 1000, // 11 min ago - }); - expect(result.allowed).toBe(false); - expect(result.violations).toContain('time'); - }); - - it('reports multiple violations', () => { - const state: SquadState = { - squad: 'kernel', - sprint: { goal: 'test', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { open: 5, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - const result = checkLoopGuards(defaultGuards, state, { - retryCount: 4, - predictedFileChanges: 25, - runStartTime: Date.now() - 15 * 60 * 1000, - }); - expect(result.allowed).toBe(false); - expect(result.violations.length).toBeGreaterThanOrEqual(3); - }); -}); -``` - -- [ ] **Step 2: Run test to verify it fails** - -Run: `pnpm vitest run packages/swarm/tests/loop-guards.test.ts` -Expected: FAIL — `checkLoopGuards` not found - -- [ ] **Step 3: Implement loop guards** - -```typescript -// packages/swarm/src/loop-guards.ts -import type { LoopGuardConfig, SquadState } from './types.js'; - -export interface LoopGuardContext { - retryCount: number; - predictedFileChanges: number; - runStartTime: number; -} - -export type GuardViolation = 'budget' | 'retry' | 'blast-radius' | 'cascade' | 'time'; - -export interface LoopGuardResult { - allowed: boolean; - violations: GuardViolation[]; - messages: string[]; -} - -export function checkLoopGuards( - config: LoopGuardConfig, - state: SquadState, - context: LoopGuardContext, -): LoopGuardResult { - const violations: GuardViolation[] = []; - const messages: string[] = []; - - // 1. Budget guard - if (state.prQueue.open >= config.maxOpenPRsPerSquad) { - violations.push('budget'); - messages.push( - `PR budget exceeded: ${state.prQueue.open} open (max ${config.maxOpenPRsPerSquad}). 
Skip implementation, focus on review/merge.`, - ); - } - - // 2. Retry guard - if (context.retryCount > config.maxRetries) { - violations.push('retry'); - messages.push( - `Retry limit exceeded: ${context.retryCount} attempts (max ${config.maxRetries}). Create escalation issue.`, - ); - } - - // 3. Blast radius guard - if (context.predictedFileChanges > config.maxBlastRadius) { - violations.push('blast-radius'); - messages.push( - `Blast radius exceeded: ${context.predictedFileChanges} files (max ${config.maxBlastRadius}). Escalate to Architect.`, - ); - } - - // 4. Time guard - const elapsedMs = Date.now() - context.runStartTime; - const elapsedMin = elapsedMs / 60_000; - if (elapsedMin > config.maxRunMinutes) { - violations.push('time'); - messages.push( - `Run time exceeded: ${Math.round(elapsedMin)}min (max ${config.maxRunMinutes}min). Force-stop, EM investigates.`, - ); - } - - return { - allowed: violations.length === 0, - violations, - messages, - }; -} -``` - -Export from `packages/swarm/src/index.ts`. 
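The comparisons above differ at the boundary: the budget guard uses `>=`, while the retry guard uses `>`. The sketch below isolates just those two checks to show what happens when a value sits exactly at its limit. The helper names `budgetTripped` and `retryTripped` are hypothetical and exist only for this illustration.

```typescript
// Hypothetical helpers mirroring two comparisons from checkLoopGuards.
const budgetTripped = (openPRs: number, maxOpenPRs: number): boolean =>
  openPRs >= maxOpenPRs; // trips AT the limit

const retryTripped = (retryCount: number, maxRetries: number): boolean =>
  retryCount > maxRetries; // trips only ABOVE the limit

console.log(budgetTripped(3, 3)); // true
console.log(retryTripped(3, 3)); // false
console.log(retryTripped(4, 3)); // true
```

With the default limits, a squad that already has three open PRs is blocked from opening another, whereas a `retryCount` of 3 still passes and only 4 is blocked.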
- -- [ ] **Step 4: Run tests** - -Run: `pnpm vitest run packages/swarm/tests/loop-guards.test.ts` -Expected: PASS - -- [ ] **Step 5: Commit** - -```bash -git add packages/swarm/ -git commit -m "feat(swarm): 5-layer loop guards — budget, retry, blast radius, cascade, time" -``` - ---- - -### Task 5: Identity format integration + scaffolder update - -**Files:** -- Modify: `packages/swarm/src/scaffolder.ts` -- Create: `packages/swarm/tests/squad-scaffold.test.ts` - -- [ ] **Step 1: Write the failing test** - -```typescript -// packages/swarm/tests/squad-scaffold.test.ts -import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { scaffoldSquad } from '../src/scaffolder.js'; -import { loadSquadManifest } from '../src/squad-manifest.js'; -import { readFileSync, mkdtempSync, rmSync, existsSync } from 'fs'; -import { join } from 'path'; -import { tmpdir } from 'os'; - -describe('scaffoldSquad', () => { - let dir: string; - - beforeEach(() => { - dir = mkdtempSync(join(tmpdir(), 'scaffold-')); - }); - - afterEach(() => { - rmSync(dir, { recursive: true, force: true }); - }); - - it('creates .agentguard-identity for each agent', () => { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - const manifest = loadSquadManifest(yaml); - scaffoldSquad(dir, 'kernel', manifest.squads.kernel); - - // Check that squad state dir exists - expect(existsSync(join(dir, '.agentguard', 'squads', 'kernel'))).toBe(true); - }); - - it('creates initial empty squad state', () => { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - const manifest = loadSquadManifest(yaml); - scaffoldSquad(dir, 'kernel', manifest.squads.kernel); - - const statePath = join(dir, '.agentguard', 'squads', 'kernel', 'state.json'); - expect(existsSync(statePath)).toBe(true); - const state = JSON.parse(readFileSync(statePath, 'utf8')); - 
expect(state.squad).toBe('kernel'); - }); -}); -``` - -- [ ] **Step 2: Run test to verify it fails** - -Run: `pnpm vitest run packages/swarm/tests/squad-scaffold.test.ts` -Expected: FAIL — `scaffoldSquad` not found - -- [ ] **Step 3: Implement scaffoldSquad** - -Add to `packages/swarm/src/scaffolder.ts`: - -```typescript -import { mkdirSync, writeFileSync, existsSync } from 'node:fs'; -import { join } from 'node:path'; -import type { Squad, SquadState } from './types.js'; - -export function scaffoldSquad(root: string, squadName: string, squad: Squad): void { - const squadDir = join(root, '.agentguard', 'squads', squadName); - mkdirSync(squadDir, { recursive: true }); - - // Write initial squad state - const initialState: SquadState = { - squad: squadName, - sprint: { goal: '', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { open: 0, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - - const statePath = join(squadDir, 'state.json'); - if (!existsSync(statePath)) { - writeFileSync(statePath, JSON.stringify(initialState, null, 2), 'utf8'); - } - - // Write learnings store - const learningsPath = join(squadDir, 'learnings.json'); - if (!existsSync(learningsPath)) { - writeFileSync(learningsPath, '[]', 'utf8'); - } -} -``` - -Export from `packages/swarm/src/index.ts`. 
- -- [ ] **Step 4: Run tests** - -Run: `pnpm vitest run packages/swarm/tests/squad-scaffold.test.ts` -Expected: PASS - -- [ ] **Step 5: Commit** - -```bash -git add packages/swarm/ -git commit -m "feat(swarm): scaffoldSquad creates state dirs, initial state, learnings store" -``` - ---- - -### Task 6: Build, full test suite, docs update - -- [ ] **Step 1: Run full swarm package tests** - -```bash -pnpm test --filter=@red-codes/swarm -``` -Expected: all PASS - -- [ ] **Step 2: Run full monorepo tests** - -```bash -pnpm build && pnpm test -``` -Expected: all PASS (no regressions from new types/exports) - -- [ ] **Step 3: Update swarm README** - -Add a "Squad Structure" section to `packages/swarm/README.md` documenting: -- Squad manifest schema -- Identity format (`driver:model:squad:rank`) -- Loop guards -- State file locations -- Migration from 26-agent flat pool - -- [ ] **Step 4: Commit** - -```bash -git add packages/swarm/ -git commit -m "docs(swarm): update README with squad structure, identity format, loop guards" -``` - ---- - -### Task 7: Deploy manifest + scaffold squads across repos - -- [ ] **Step 1: Copy default manifest to workspace repos** - -```bash -# agent-guard -cp packages/swarm/templates/config/squad-manifest.default.yaml .agentguard/squad-manifest.yaml - -# agentguard-cloud -cp packages/swarm/templates/config/squad-manifest.default.yaml ../agentguard-cloud/.agentguard/squad-manifest.yaml -``` - -- [ ] **Step 2: Run scaffold for each squad** - -Create a script or run manually: - -```bash -# From agent-guard root -node -e " -const { loadSquadManifest, scaffoldSquad } = require('./packages/swarm/dist/index.js'); -const { readFileSync } = require('fs'); -const yaml = readFileSync('.agentguard/squad-manifest.yaml', 'utf8'); -const manifest = loadSquadManifest(yaml); -for (const [name, squad] of Object.entries(manifest.squads)) { - if (squad.repo === 'agent-guard' || squad.repo === '*') { - scaffoldSquad('.', name, squad); - console.log('Scaffolded 
squad:', name); - } -} -" -``` - -- [ ] **Step 3: Verify state files created** - -```bash -ls -la .agentguard/squads/kernel/ -ls -la .agentguard/squads/qa/ -# Expected: state.json, learnings.json in each -``` - -- [ ] **Step 4: Commit manifests and state files** - -```bash -git add .agentguard/squad-manifest.yaml .agentguard/squads/ -git commit -m "chore: deploy squad manifest and scaffold kernel + qa squads" -``` diff --git a/docs/superpowers/specs/2026-03-21-agent-identity-and-worktree-enforcement-design.md b/docs/superpowers/specs/2026-03-21-agent-identity-and-worktree-enforcement-design.md deleted file mode 100644 index e72774e7..00000000 --- a/docs/superpowers/specs/2026-03-21-agent-identity-and-worktree-enforcement-design.md +++ /dev/null @@ -1,158 +0,0 @@ -# Agent Identity Hard Gate & Worktree Policy Enforcement - -**Date:** 2026-03-21 -**Status:** Approved - -## Overview - -Two features that work together to support governed autonomous agent swarms: - -1. **Agent Identity Hard Gate** — Every governance session requires agent identity. No session starts without it. -2. **Worktree Policy Enforcement** — YAML policy can require worktree usage via `requireWorktree` rule condition, with new `git.worktree.*` action types. - ---- - -## Feature 1: Agent Identity Hard Gate - -### Problem - -AgentGuard sessions start without knowing which agent is running. The kernel never emits a `RUN_STARTED` event. In swarm scenarios, there's no way to distinguish which agent took which actions. The `.agentguard-identity` file exists but nothing reads it. - -### Design - -#### Identity Resolution (ordered, first wins) - -1. `--agent-name ` CLI flag on `aguard guard` -2. `AGENTGUARD_AGENT_NAME` environment variable (per-process) -3. 
Interactive prompt (writes answer for session duration) - -#### Stateless File Contract - -- **Session start:** blank `.agentguard-identity` (wipe stale values) -- **After resolution:** write resolved identity to `.agentguard-identity` (read window for other tools) -- **Session end:** blank `.agentguard-identity` again -- File is `.gitignore`d — never committed, purely session-scoped output - -#### Changes - -| File | Change | -|------|--------| -| `packages/core/src/types.ts` | Add `agentName: string` to `RunManifest` | -| `packages/kernel/src/kernel.ts` | Emit `RUN_STARTED` event with `agentName`; reject `propose()` if no identity | -| `apps/cli/src/commands/guard.ts` | Add `--agent-name` flag; resolve identity before kernel creation; blank/write/blank lifecycle | -| `.gitignore` | Add `.agentguard-identity` | -| `apps/cli/src/bin.ts` or `args.ts` | Wire `--agent-name` option to guard command | - -#### RUN_STARTED Event Payload - -```typescript -{ - kind: 'RUN_STARTED', - runId: string, - timestamp: string, - payload: { - agentName: string, - sessionId: string, - policy: string | undefined, - manifest: RunManifest - } -} -``` - -#### Autonomous Agent Flow - -Orchestrators (swarm scaffolder, CI) pass identity via CLI flag: -```bash -aguard guard --agent-name "builder-agent-3" --policy agentguard.yaml -``` - -No need to pre-write files. Each subprocess gets its own `--agent-name`. - -#### Interactive Flow - -``` -$ aguard guard -⚠ No agent identity set. -Agent name: █ -> my-dev-session -✓ Identity set: my-dev-session -``` - ---- - -## Feature 2: Worktree Policy Enforcement - -### Problem - -No way to enforce worktree usage through YAML policy. Agents can `git checkout` freely, which in swarm scenarios causes conflicts in shared repos. 
- -### Design - -#### New Action Types - -Add to `packages/core/src/data/actions.json`: - -| Action Type | Class | Description | -|------------|-------|-------------| -| `git.worktree.add` | git | Create a git worktree | -| `git.worktree.remove` | git | Remove a git worktree | -| `git.worktree.list` | git | List git worktrees | - -#### AAB Detection - -In `packages/kernel/src/aab.ts`, detect shell commands: -- `git worktree add ...` → `git.worktree.add` -- `git worktree remove ...` / `git worktree prune` → `git.worktree.remove` -- `git worktree list` → `git.worktree.list` - -#### Policy Condition: `requireWorktree` - -Rule-level condition (like `requireTests`). When `requireWorktree: true` is set on a rule matching `git.checkout` or `git.branch.create`, the action is denied with a message directing the agent to use worktrees instead. - -```yaml -rules: - - action: git.checkout - effect: deny - conditions: - requireWorktree: true - reason: "Use 'git worktree add ' instead of checkout" - - - action: git.worktree.add - effect: allow - - - action: git.worktree.list - effect: allow -``` - -#### Changes - -| File | Change | -|------|--------| -| `packages/core/src/data/actions.json` | Add `git.worktree.add/remove/list` | -| `packages/core/src/actions.ts` | Update action type union if needed | -| `packages/kernel/src/aab.ts` | Detect `git worktree` commands, normalize to action types | -| `packages/policy/src/evaluator.ts` | Add `requireWorktree` condition to rule matching | -| `packages/policy/src/yaml-loader.ts` | Accept `requireWorktree` in YAML schema | -| `packages/adapters/src/git.ts` | Add handlers for worktree action types | -| `packages/adapters/src/shell.ts` | Add worktree commands to privilege profiles | -| `packages/core/src/data/git-action-patterns.json` | Add worktree patterns | - -#### Evaluator Logic - -```typescript -// In policy evaluator, when checking conditions: -if (rule.conditions?.requireWorktree && action.type === 'git.checkout') { - return { - 
effect: 'deny', - reason: rule.reason ?? 'Worktree required: use git worktree add instead of checkout' - }; -} -``` - ---- - -## Testing - -- **Identity gate:** test resolution order (flag > env > prompt), stateless file lifecycle, kernel rejects propose() without identity, RUN_STARTED event emission -- **Worktree actions:** test AAB normalization of `git worktree` commands, policy evaluation with `requireWorktree` condition, adapter execution -- **Integration:** guard command with `--agent-name`, guard command with interactive prompt (mock stdin) diff --git a/docs/superpowers/specs/2026-03-25-squad-swarm-structure-design.md b/docs/superpowers/specs/2026-03-25-squad-swarm-structure-design.md deleted file mode 100644 index f8e12360..00000000 --- a/docs/superpowers/specs/2026-03-25-squad-swarm-structure-design.md +++ /dev/null @@ -1,347 +0,0 @@ -# Agent Swarm Squad Structure — Design Spec - -**Date:** 2026-03-25 -**Status:** Approved -**Author:** Jared + Claude - -## Problem - -The current agent swarm is a flat pool of 26 agents in 5 functional tiers. Agents step on each other, there's no code ownership, no review chain, and no accountability per area. When 8 PRs pile up with 1 conflicted, nobody owns the resolution. The swarm coordination via `swarm-state.json` works for state sharing but has no hierarchy for decision-making or escalation. - -The goal: restructure the swarm into squads that operate like an engineering org — clear ownership, built-in review chains, seniority-based model assignment, multi-vendor governance (Claude Code + Copilot CLI), and telemetry that flows from agent identity through the database to the dashboard and office-sim. - -## Design - -### Squad Structure - -3 squads, each with 5 agents + 1 EM. 1 Director. 1 Human. 
- -``` -Human (Jared) - └── Director (Opus, claude-code) - ├── EM: Kernel Squad (Opus, claude-code) - │ ├── Product Lead (Sonnet, claude-code) - │ ├── Architect (Opus, claude-code) - │ ├── Senior Engineer (Sonnet, copilot-cli) - │ ├── Junior Engineer (Copilot, copilot-cli) - │ └── QA Automation (Sonnet, copilot-cli) - │ - ├── EM: Cloud Squad (Opus, claude-code) - │ ├── Product Lead (Sonnet, claude-code) - │ ├── Architect (Opus, claude-code) - │ ├── Senior Engineer (Sonnet, copilot-cli) - │ ├── Junior Engineer (Copilot, copilot-cli) - │ └── QA Automation (Sonnet, copilot-cli) - │ - └── EM: QA Squad (Sonnet, claude-code) - ├── Product Lead (Sonnet, claude-code) - ├── Architect (Sonnet, claude-code) - ├── Senior Engineer (Sonnet, copilot-cli) - ├── Junior Engineer (Copilot, copilot-cli) - └── QA Automation (Haiku, copilot-cli) -``` - -**Total: 15 squad agents + 3 EMs + 1 Director = 19 agents** - -**Ownership:** -- **Kernel Squad** — `agent-guard/` (Go kernel, TS kernel, policy, invariants, matchers, CLI, npm publishing) -- **Cloud Squad** — `agentguard-cloud/` (dashboard, API server, telemetry, office-sim, Vercel deployment) -- **QA Squad** — Cross-repo (stranger tests, Playwright e2e, compliance suite, swarm health, CI/CD) - -### Model & Driver Assignment - -Copilot CLI is a first-class workhorse — 9 of 15 squad agents use it. Claude Code is reserved for leadership and architecture roles. 
- -| Role | Model | Driver | Rationale | -|------|-------|--------|-----------| -| Director | Opus | claude-code | Cross-squad reasoning needs depth | -| EM | Opus | claude-code | Architectural context for coordination | -| Product Lead | Sonnet | claude-code | Prioritization, roadmap alignment | -| Architect | Opus | claude-code | Design reviews need deepest reasoning | -| Senior Engineer | Sonnet | copilot-cli | Fast implementation, high volume | -| Junior Engineer | Copilot | copilot-cli | Cheapest, highest volume tasks | -| QA Automation | Sonnet/Haiku | copilot-cli | Fast test execution, high frequency | - -This stress-tests multi-vendor governance — every Copilot CLI action goes through the same AgentGuard kernel as Claude Code actions via the `copilot-cli` adapter. - -### Coordination Protocol - -Three layers: squad-internal, cross-squad, human escalation. - -**Squad-internal (files — fast, every run):** - -Each squad writes to `.agentguard/squads/{squad-name}/state.json`: -```json -{ - "squad": "kernel", - "sprint": { "goal": "Go kernel Phase 2", "issues": ["#860", "#862"] }, - "assignments": { - "architect": { "current": "#860", "status": "reviewing" }, - "senior": { "current": "#862", "status": "implementing" }, - "junior": { "current": "#863", "status": "writing-tests" }, - "qa": { "current": null, "waiting": "senior to push" } - }, - "blockers": [], - "prQueue": { "open": 2, "reviewed": 1, "mergeable": 1 } -} -``` - -PL sets priorities. Architect reviews before merge. Senior implements. Junior writes tests + handles chores. QA runs e2e after merge. EM reads state, unblocks, and escalates. - -**Cross-squad (EM → Director — aggregated files):** - -EMs write to `.agentguard/squads/{squad-name}/em-report.json`. Director reads all EM reports and writes `.agentguard/director-brief.json` — the human's daily summary. - -**Human escalation (GitHub issues):** - -When Director needs the human: creates an issue labeled `escalation:human`. 
When an EM is stuck: creates an issue labeled `escalation:em`. Human comments to resolve. Agent polls for response. - -### Adopted Patterns (from everything-claude-code) - -**1. Continuous Learning Cycle** - -Each squad maintains `.agentguard/squads/{squad-name}/learnings.json`. Four operations: - -- **Learn** — After every PR merge, QA extracts patterns: what worked, what broke, what was surprising. Stored as `{pattern, confidence, source_pr, timestamp}`. -- **Eval** — Weekly, Architect scores accumulated patterns (0-1 confidence). Low-confidence patterns get more evidence or get dropped. -- **Evolve** — Monthly, PL clusters high-confidence patterns into new squad-specific skills. A pattern in 3+ PRs becomes a skill file. -- **Prune** — Monthly, EM removes stale learnings (>30 days, low confidence, superseded). - -**2. Autonomous Loop Guards (5-layer)** - -Every agent run checks before proceeding: - -1. **Budget guard** — Max 3 PRs open per squad. Exceeded → skip implementation, only review/merge. -2. **Retry guard** — Same action failed 3 times → stop, create escalation issue. -3. **Blast radius guard** — Predicted file changes > 20 → pause, escalate to Architect. -4. **Cascade guard** — 2+ squads both modifying same file → pause both, escalate to Director. -5. **Time guard** — Agent run exceeds 10 minutes → force-stop, log partial progress, EM investigates. - -**3. Orchestration Commands** - -EM-level commands for squad task management: - -- `/squad-plan {goal}` — PL + Architect decompose goal into tasks, assign to Sr/Jr/QA -- `/squad-execute` — EM dispatches all assigned tasks in dependency order -- `/squad-status` — EM aggregates squad state into one-line summary per agent -- `/squad-retro` — QA + Architect review last sprint's merged PRs, extract learnings - -**4. Skill Expansion (39 → 125+)** - -Current 39 skills are SDLC-focused. 
Expand with: -- Per-language skills (Go, Python, TypeScript patterns) -- Per-squad skills (kernel: policy testing, invariant validation; cloud: Vercel deployment, DB migrations) -- Business skills (content generation, changelog writing, release notes) -- Learning skills (pattern extraction, confidence scoring, skill evolution) -- Autonomous loop skills (DAG orchestration, sequential pipelines, PR loops) - -Skills are composable — agents reference which skills they use. Squads inherit shared skills + add squad-specific ones. Manifest-driven selective install (from everything-claude-code) prevents skill bloat. - -### Agent Identity & Manifest - -**Identity format:** `{driver}:{model}:{squad}:{rank}` - -Examples: -- `claude-code:opus:kernel:architect` -- `copilot-cli:copilot:cloud:junior` -- `claude-code:opus:director:director` - -**Manifest structure** — extends existing `packages/swarm/manifest.json`: - -```yaml -# .agentguard/squad-manifest.yaml -org: - director: - id: director - driver: claude-code - model: opus - cron: "0 7,19 * * *" - skills: [squad-status, director-brief, escalation-router] - -squads: - kernel: - repo: agent-guard - em: - id: kernel-em - driver: claude-code - model: opus - cron: "0 */3 * * *" - skills: [squad-plan, squad-execute, squad-status, squad-retro, escalation-router] - agents: - product-lead: - driver: claude-code - model: sonnet - cron: "0 6 * * *" - skills: [sprint-planning, roadmap-expand, backlog-steward, learn] - architect: - driver: claude-code - model: opus - cron: "0 */4 * * *" - skills: [architecture-review, review-open-prs, eval, evolve] - senior: - driver: copilot-cli - model: sonnet - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, create-pr, run-tests] - junior: - driver: copilot-cli - model: copilot - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, run-tests, generate-tests] - qa: - driver: copilot-cli - model: sonnet - cron: "0 */3 * * *" - skills: [e2e-testing, compliance-test, 
test-health-review, learn, prune] - - cloud: - repo: agentguard-cloud - em: - id: cloud-em - driver: claude-code - model: opus - cron: "0 */3 * * *" - skills: [squad-plan, squad-execute, squad-status, squad-retro, escalation-router] - agents: - product-lead: - driver: claude-code - model: sonnet - cron: "0 6 * * *" - skills: [sprint-planning, roadmap-expand, backlog-steward, learn] - architect: - driver: claude-code - model: opus - cron: "0 */4 * * *" - skills: [architecture-review, review-open-prs, eval, evolve] - senior: - driver: copilot-cli - model: sonnet - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, create-pr, run-tests] - junior: - driver: copilot-cli - model: copilot - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, run-tests, generate-tests] - qa: - driver: copilot-cli - model: sonnet - cron: "0 */3 * * *" - skills: [e2e-testing, compliance-test, test-health-review, learn, prune] - - qa: - repo: "*" - em: - id: qa-em - driver: claude-code - model: sonnet - cron: "0 */3 * * *" - skills: [squad-plan, squad-execute, squad-status, squad-retro, escalation-router] - agents: - product-lead: - driver: claude-code - model: sonnet - cron: "0 6 * * *" - skills: [sprint-planning, test-strategy, stranger-test-plan, learn] - architect: - driver: claude-code - model: sonnet - cron: "0 */4 * * *" - skills: [test-architecture, compliance-review, eval, evolve] - senior: - driver: copilot-cli - model: sonnet - cron: "0 */2 * * *" - skills: [playwright-e2e, stranger-test-run, compliance-test, create-pr] - junior: - driver: copilot-cli - model: copilot - cron: "0 */2 * * *" - skills: [generate-tests, run-tests, test-data-generation] - qa: - driver: copilot-cli - model: haiku - cron: "0 */1 * * *" - skills: [e2e-testing, regression-analysis, flakiness-detection, learn, prune] -``` - -### Migration from Current 26 Agents - -Existing agents don't disappear — they get reassigned into squads: - -| Current Agent | Squad | New Role | 
-|--------------|-------|----------| -| coder-agent | kernel/cloud | Senior Engineer | -| code-review-agent | kernel/cloud | Architect responsibility | -| pr-merger-agent | kernel/cloud | EM responsibility | -| ci-triage-agent | qa | QA Automation | -| merge-conflict-resolver | kernel/cloud | EM responsibility | -| pr-review-responder | kernel/cloud | Architect responsibility | -| stale-branch-janitor | qa | Junior Engineer chore | -| recovery-controller | — | Director responsibility | -| risk-escalation-agent | — | Director responsibility | -| governance-monitor | qa | QA Architect | -| planning-agent | kernel/cloud | Product Lead | -| backlog-steward | kernel/cloud | Product Lead skill | -| observability-agent | qa | QA Product Lead | -| docs-sync-agent | kernel/cloud | Junior Engineer chore | -| product-agent | kernel/cloud | Product Lead | -| progress-controller | — | Director skill | -| repo-hygiene-agent | qa | Junior Engineer chore | -| retrospective-agent | — | EM skill (squad-retro) | -| test-agent | qa | QA Senior Engineer | -| test-generation-agent | qa | QA Junior Engineer | -| security-audit-agent | qa | QA Architect | -| architect-agent | kernel/cloud | Architect | -| cicd-hardening-agent | qa | QA Senior Engineer | -| audit-merged-prs-agent | qa | QA Automation | -| infrastructure-health-agent | qa | QA Automation | -| marketing-content-agent | cloud | Junior Engineer skill | - -### Database & Telemetry Integration - -Squad identity flows through the entire platform pipeline: - -``` -squad-manifest.yaml → .agentguard-identity → kernel → telemetry → Postgres → dashboard → office-sim -``` - -**Identity propagation:** -1. Agent starts with identity `copilot-cli:sonnet:kernel:senior` in `.agentguard-identity` -2. Governance hook reads identity, attaches to every `GovernanceDecisionRecord` -3. Telemetry `AgentEvent` includes full `agentId` with squad/rank -4. Cloud telemetry server stores events in Postgres with agent identity -5. 
Dashboard queries by squad, rank, model, driver -6. Office-sim visualizes squads as teams with real-time action streams - -**Database: parse from agentId at query time (no schema change):** - -The identity format `{driver}:{model}:{squad}:{rank}` is parseable: -```sql -SELECT - split_part(agent_id, ':', 3) AS squad, - split_part(agent_id, ':', 4) AS rank, - split_part(agent_id, ':', 1) AS driver, - count(*) AS decisions -FROM agent_events -WHERE timestamp > now() - interval '7 days' -GROUP BY squad, rank, driver -ORDER BY decisions DESC; -``` - -No schema migration needed — the existing `agent_id` text field carries the full hierarchy. - -**Dashboard views enabled:** -- Squad leaderboard (PRs merged, tests written, denials per squad) -- Agent performance by rank (Opus architects vs Sonnet seniors) -- Model cost analysis (Copilot CLI vs Claude Code event volume) -- Escalation funnel (issues reaching EM → Director → Human) -- Office-sim squad visualization (live agent activity grouped by squad) -- Learning cycle metrics (patterns extracted, skills evolved, confidence trends) - -## Non-Goals - -- More than 3 squads initially (add Dashboard squad when justified) -- Custom LLM fine-tuning for agent roles (use prompt engineering via skills) -- Real-time inter-agent chat (async file + issue coordination is sufficient) -- Windows Copilot CLI support (Linux only for now) diff --git a/packages/scheduler/package.json b/packages/scheduler/package.json index cdca028a..9e4e824b 100644 --- a/packages/scheduler/package.json +++ b/packages/scheduler/package.json @@ -1,7 +1,7 @@ { "name": "@red-codes/scheduler", "version": "2.9.3", - "description": "Task scheduler, queue, lease manager, and worker orchestration for AgentGuard swarm", + "description": "Task scheduler, queue, lease manager, and worker orchestration for AgentGuard", "type": "module", "main": "dist/index.js", "types": "dist/index.d.ts", diff --git a/packages/swarm/README.md b/packages/swarm/README.md deleted file mode 
100644 index e0fe9807..00000000 --- a/packages/swarm/README.md +++ /dev/null @@ -1,349 +0,0 @@ -# @red-codes/swarm - -**Autonomous agent swarm for software development.** 26 coordinated AI agents that handle your entire SDLC — implementation, code review, CI triage, security audits, planning, and more — all running under governance policy enforcement. - -This is the same swarm that builds AgentGuard itself. - -## What It Does - -The swarm scaffolds a complete autonomous development pipeline into any repository. Agents are scheduled via cron and coordinate through shared state. Every agent action passes through [AgentGuard](https://github.com/AgentGuardHQ/agent-guard) governance — policy evaluation, invariant checking, escalation tracking, and audit logging. - -``` -ROADMAP.md (you write strategy) - │ - ├── Planning Agent (daily) ─── reads roadmap, sets priorities - ├── Backlog Steward (daily) ── expands roadmap into issues - ├── Coder Agent (2-hourly) ─── picks issues, implements, creates PRs - ├── Code Review Agent (2h) ─── reviews PRs for quality - ├── Architect Agent (daily) ── reviews PRs for architecture - ├── CI Triage Agent (hourly) ─ fixes failing CI - ├── PR Merger Agent (2h) ───── auto-merges when gates pass - ├── Security Audit (weekly) ── dependency + code scanning - ├── Recovery Controller (2h) ─ self-healing, detects unhealthy state - └── ... 17 more agents -``` - -## Quick Start - -```bash -# Clone and build AgentGuard -git clone https://github.com/AgentGuardHQ/agent-guard.git -cd agent-guard -pnpm install && pnpm build - -# Link the CLI globally -cd apps/cli && npm link && cd ../.. 
- -# Scaffold the swarm into your project -cd /path/to/your-project -aguard init swarm - -# This creates: -# agentguard-swarm.yaml — swarm configuration -# .claude/skills/*.md — 39 skill definitions -# agentguard.yaml — governance policy (if missing) -``` - -### Claude Desktop Setup - -After scaffolding, each agent must be registered as a **scheduled task** in Claude Desktop. The scaffolder outputs the full agent manifest with cron schedules — but you still need to configure three things manually: - -#### 1. Create scheduled tasks in Claude Desktop - -Open Claude Desktop and create a scheduled task for each agent you want to run. Use the cron schedule from the agent manifest and point the task at the agent's prompt file (e.g., `.claude/prompts/coder-agent.md`). - -#### 2. Enable worktree isolation - -In each scheduled task's configuration, set **worktree to `true`**. This gives each agent run an isolated git worktree so parallel agents don't interfere with each other's file changes or git state. - -```json -{ - "task": "coder-agent", - "schedule": "0 */2 * * *", - "worktree": true -} -``` - -Without worktree isolation, concurrent agents will conflict on file writes, git index locks, and branch state. - -#### 3. Bypass permissions manually - -Scheduled agents run **unattended** — there is no human to approve tool-use permission prompts. You must manually pre-approve the necessary permissions for each agent before it can run autonomously. In Claude Desktop, configure the agent's permissions to allow the tools it needs (file read/write, shell execution, git operations) without interactive confirmation. - -> **Important:** Review the governance policy (`agentguard.yaml`) before granting broad permissions. AgentGuard's invariant system acts as a second layer of defense, but the policy should be tuned to your project's risk tolerance. 
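The scheduled-task entries described in steps 1 and 2 above could be generated from the agent manifest rather than typed by hand. The following is a minimal sketch under stated assumptions: the `task` / `schedule` / `worktree` shape is copied from the JSON snippet in step 2, the `id` and `cron` fields mirror the entries in `manifest.json`, and `toScheduledTasks` is a hypothetical helper, not part of `@red-codes/swarm`. Whether Claude Desktop can import such a list directly is not stated in the README.

```typescript
// Hypothetical helper: names and types are illustrative assumptions,
// not the actual @red-codes/swarm or Claude Desktop API.
interface ManifestAgent {
  id: string;   // e.g. "coder-agent"
  cron: string; // e.g. "0 */2 * * *"
}

interface ScheduledTask {
  task: string;
  schedule: string;
  worktree: boolean;
}

// Map each manifest agent to a scheduled-task entry with worktree
// isolation always enabled, as step 2 recommends.
function toScheduledTasks(agents: ManifestAgent[]): ScheduledTask[] {
  return agents.map((agent) => ({
    task: agent.id,
    schedule: agent.cron,
    worktree: true,
  }));
}

const exampleTasks = toScheduledTasks([
  { id: 'coder-agent', cron: '0 */2 * * *' },
  { id: 'ci-triage-agent', cron: '0 * * * *' },
]);
console.log(JSON.stringify(exampleTasks, null, 2));
```

Hard-coding `worktree: true` in the mapping keeps a hand-edited entry from accidentally dropping isolation.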
- -## Agents - -### Core Tier (7 agents) - -| Agent | Schedule | Role | -|-------|----------|------| -| Implementation Agent | Every 2h | Picks issues, implements code, creates PRs | -| Code Review Agent | Every 2h | Reviews open PRs for quality | -| PR Merger Agent | Every 2h | Auto-merges PRs with passing CI + approvals | -| CI Triage Agent | Hourly | Diagnoses and fixes CI failures | -| Merge Conflict Resolver | Every 2h | Resolves conflicts (1 PR per run) | -| PR Review Responder | Hourly | Responds to review comments | -| Stale Branch Janitor | Daily 8am | Cleans up stale branches and PRs | - -### Governance Tier (3 agents) - -| Agent | Schedule | Role | -|-------|----------|------| -| Recovery Controller | Every 2h | Self-healing — detects and remediates unhealthy swarm state | -| Risk Escalation Agent | Every 4.5h | Cumulative risk assessment, gates dangerous operations | -| Governance Monitor | Daily 2am | Audits governance logs, reviews policy effectiveness | - -### Ops Tier (8 agents) - -| Agent | Schedule | Role | -|-------|----------|------| -| Planning Agent | Daily 6am | Sprint planning, priority setting | -| Backlog Steward | Daily 5am | Expands ROADMAP into issues (max 3/run) | -| Observability Agent | Daily 9am | SRE health monitoring | -| Documentation Maintainer | Daily 11am | Keeps docs in sync with code | -| Product Agent | Daily 7am | Product health and roadmap alignment | -| Progress Controller | Daily 7am | Tracks roadmap phase completion | -| Repo Hygiene Agent | Daily 3am | Detects stale issues | -| Retrospective Agent | Weekly Mon 8am | Failure analysis, lessons learned | - -### Quality Tier (7 agents) - -| Agent | Schedule | Role | -|-------|----------|------| -| Test Agent | Daily 8am | Test health and coverage analysis | -| Test Generation Agent | Weekly Mon 11am | Generates tests for untested modules | -| Security Audit Agent | Weekly Sun 8pm | Dependency + source code security scan | -| Architect Agent | Daily 10am | Architectural 
review of PRs | -| CI/CD Hardening Agent | Weekly Sun 9pm | Action pinning, permissions, supply chain audit | -| Merged PR Auditor | Weekly Mon 9am | Audits recently merged PRs for missed risks | -| Infrastructure Health Agent | Daily 9pm | SDLC pipeline health check | - -### Marketing Tier (1 agent) - -| Agent | Schedule | Role | -|-------|----------|------| -| Marketing Content Agent | Weekly Mon 9am | Drafts social posts and blog outlines | - -## Configuration - -After scaffolding, customize `agentguard-swarm.yaml`: - -```yaml -swarm: - # Enable/disable agent tiers - tiers: - - core # Essential: coder, reviewer, merger, CI triage - - governance # Risk escalation, recovery, policy audit - - ops # Planning, observability, docs, retrospectives - - quality # Testing, security, architecture review - # - marketing # Content generation (opt-in) - - # Override cron schedules per agent - schedules: - coder-agent: '0 */4 * * *' # Slow down to every 4 hours - ci-triage-agent: '0 */2 * * *' # Less frequent CI checks - - # Project-specific paths - paths: - policy: agentguard.yaml # Governance policy file - roadmap: ROADMAP.md # Your project roadmap - swarmState: .agentguard/swarm-state.json - logs: logs/runtime-events.jsonl - cli: agentguard # How to invoke the CLI - - # Behavioral thresholds - thresholds: - maxOpenPRs: 5 # Coder stops creating PRs above this - prStaleHours: 48 # PRs older than this get flagged - blastRadiusHigh: 16 # Actions above this score get escalated -``` - -## Multi-Project Setup - -The swarm is designed to work across multiple repositories. 
Each repo gets its own: -- `ROADMAP.md` — defines what the swarm builds -- `agentguard-swarm.yaml` — configures agent behavior -- `agentguard.yaml` — governance policy -- `.agentguard/swarm-state.json` — runtime coordination state - -To run the same swarm on two repos (e.g., an OSS repo and a private enterprise repo): - -```bash -# In your OSS repo -cd ~/oss-project -aguard init swarm - -# In your enterprise repo -cd ~/enterprise-project -aguard init swarm -``` - -Each repo has its own ROADMAP.md driving independent priorities. The agents operate under their respective governance policies. Swarm state is per-repo. - -## How Agents Coordinate - -Agents share state through `.agentguard/swarm-state.json`: - -```json -{ - "mode": "normal", - "currentPhase": "Phase 6", - "prQueueHealthy": true, - "openAgentPRs": 3, - "priorities": [42, 38, 45], - "lastProgressRun": "2026-03-15T07:00:00Z" -} -``` - -- **mode**: `normal` | `conservative` | `safe` — the Recovery Controller escalates when things go wrong -- **prQueueHealthy**: Coder Agent skips when `false` (too many open PRs) -- **priorities**: Planning Agent sets issue priority order -- **currentPhase**: Progress Controller tracks ROADMAP phase - -## Skills - -Each agent executes one or more **skills** — markdown-defined task playbooks in `.claude/skills/`. Skills are composable and reusable across agents. - -All 39 skills are scaffolded from templates with your project-specific paths and labels injected. - -## Governance Integration - -Every agent starts by invoking the `start-governance-runtime` skill, which activates AgentGuard hooks in Claude Code. 
This means: - -- All file writes, shell commands, and git operations are policy-checked -- Destructive actions are blocked by invariants -- Violations escalate the system (NORMAL → ELEVATED → HIGH → LOCKDOWN) -- Full audit trail in JSONL for every agent action - -## Squad Structure - -The swarm can be organized into **squads** — small, autonomous teams of AI agents with a reporting hierarchy. Each squad has an Engineering Manager (EM) overseeing 5 specialist agents, all reporting up to a single Director agent. - -### Squad Manifest Schema - -Squads are defined in a YAML manifest (`squad-manifest.yaml`): - -```yaml -version: "1.0.0" - -org: - director: - id: director - rank: director - driver: claude-code - model: opus - cron: "0 7,19 * * *" - skills: [squad-status, director-brief, escalation-router] - -squads: - kernel: - repo: agent-guard - em: - id: kernel-em - rank: em - driver: claude-code - model: opus - cron: "0 */3 * * *" - skills: [squad-plan, squad-execute, squad-status] - agents: - product-lead: { id: kernel-pl, rank: product-lead, driver: claude-code, model: sonnet, ... } - architect: { id: kernel-arch, rank: architect, driver: claude-code, model: opus, ... } - senior: { id: kernel-sr, rank: senior, driver: copilot-cli, model: sonnet, ... } - junior: { id: kernel-jr, rank: junior, driver: copilot-cli, model: copilot, ... } - qa: { id: kernel-qa, rank: qa, driver: copilot-cli, model: sonnet, ... } - -loopGuards: - maxOpenPRsPerSquad: 3 - maxRetries: 3 - maxBlastRadius: 20 - maxRunMinutes: 10 -``` - -Each squad agent specifies a `driver` (`claude-code` or `copilot-cli`), a `model` (`opus`, `sonnet`, `haiku`, `copilot`), a `rank`, and a set of skills. Valid ranks: `director`, `em`, `product-lead`, `architect`, `senior`, `junior`, `qa`. - -### Identity Format - -Every agent in a squad has a 4-part identity string: `driver:model:squad:rank`. 
For example: - -- `claude-code:opus:kernel:em` — the kernel squad's EM running on Claude Code with Opus -- `copilot-cli:sonnet:cloud:senior` — the cloud squad's senior dev running on Copilot CLI with Sonnet - -Identity strings are parsed from agent metadata at runtime and flow through telemetry, so the dashboard can attribute actions to specific agents within a squad. - -```typescript -import { buildAgentIdentity, parseAgentIdentity } from '@red-codes/swarm'; - -const identity = buildAgentIdentity(agent, 'kernel'); -// => "copilot-cli:sonnet:kernel:senior" - -const parsed = parseAgentIdentity('copilot-cli:sonnet:kernel:senior'); -// => { driver: 'copilot-cli', model: 'sonnet', squad: 'kernel', rank: 'senior' } -``` - -### Loop Guards - -Every agent checks **4 loop guards** at run start to prevent runaway behavior: - -| Guard | Config Key | Description | -|-------|-----------|-------------| -| Budget | `maxOpenPRsPerSquad` | Blocks new PRs when the squad has too many open | -| Retry | `maxRetries` | Stops retrying after N consecutive failures | -| Blast Radius | `maxBlastRadius` | Rejects changes touching too many files | -| Time | `maxRunMinutes` | Kills runs that exceed the time limit | - -All four guards must pass for an agent to proceed. Violations are returned with the specific guard names that failed. 
- -### State File Locations - -Each squad maintains its own state directory under `.agentguard/squads/`: - -``` -.agentguard/ - squads/ - kernel/ - state.json # Current squad state (sprint goal, assignments, PR queue, blockers) - learnings.json # Accumulated learnings from retrospectives - em-report.json # Latest EM health report for the director - cloud/ - state.json - learnings.json - em-report.json - qa/ - state.json - learnings.json - em-report.json - director-brief.json # Aggregated brief from all squad EM reports -``` - -Use `scaffoldSquad` to initialize these directories: - -```typescript -import { loadSquadManifest, scaffoldSquad } from '@red-codes/swarm'; - -const manifest = loadSquadManifest(yamlContent); -for (const [name, squad] of Object.entries(manifest.squads)) { - scaffoldSquad('/path/to/project', name, squad); -} -``` - -## Programmatic API - -```typescript -import { scaffold, loadConfig, loadManifest, filterAgentsByTier } from '@red-codes/swarm'; - -// Scaffold swarm into a project -const result = await scaffold({ - projectRoot: '/path/to/project', - force: false, - tiers: ['core', 'governance'], -}); - -// Load and filter agents -const manifest = loadManifest(); -const coreAgents = filterAgentsByTier(manifest.agents, ['core']); -``` - -## License - -[Apache 2.0](../../LICENSE) diff --git a/packages/swarm/manifest.json b/packages/swarm/manifest.json deleted file mode 100644 index a49c9e01..00000000 --- a/packages/swarm/manifest.json +++ /dev/null @@ -1,239 +0,0 @@ -{ - "version": "1.0.0", - "agents": [ - { - "id": "coder-agent", - "name": "Implementation Agent", - "tier": "core", - "cron": "0 */2 * * *", - "skills": ["start-governance-runtime", "sync-main", "discover-next-issue", "claim-issue", "implement-issue", "run-tests", "create-pr"], - "promptTemplate": "coder-agent", - "description": "Implementation Agent with PR queue gate — runs every 2 hours, skips if PR queue is full" - }, - { - "id": "code-review-agent", - "name": "Code Review Agent", 
- "tier": "core", - "cron": "0 */2 * * *", - "skills": ["start-governance-runtime", "review-open-prs"], - "promptTemplate": "code-review-agent", - "description": "Code Review Agent — reviews PRs every 2 hours" - }, - { - "id": "pr-merger-agent", - "name": "PR Merger Agent", - "tier": "core", - "cron": "0 */2 * * *", - "skills": ["start-governance-runtime", "pr-merger"], - "promptTemplate": "pr-merger-agent", - "description": "Auto-merge PRs that have passing CI and approved reviews — every 2 hours" - }, - { - "id": "ci-triage-agent", - "name": "CI Triage Agent", - "tier": "core", - "cron": "0 * * * *", - "skills": ["start-governance-runtime", "sync-main", "triage-failing-ci"], - "promptTemplate": "ci-triage-agent", - "description": "CI Triage Agent — hourly with skip-if-green guard" - }, - { - "id": "merge-conflict-resolver", - "name": "Merge Conflict Resolver", - "tier": "core", - "cron": "0 */2 * * *", - "skills": ["start-governance-runtime", "resolve-merge-conflicts"], - "promptTemplate": "merge-conflict-resolver", - "description": "Merge Conflict Resolver — serialized (1 PR/run), every 2 hours, with cascade risk detection" - }, - { - "id": "pr-review-responder", - "name": "PR Review Responder", - "tier": "core", - "cron": "0 * * * *", - "skills": ["start-governance-runtime", "respond-to-pr-reviews"], - "promptTemplate": "pr-review-responder", - "description": "Respond to unresolved PR review comments on agent-authored PRs" - }, - { - "id": "stale-branch-janitor", - "name": "Stale Branch Janitor", - "tier": "core", - "cron": "0 8 * * *", - "skills": ["start-governance-runtime", "stale-branch-janitor"], - "promptTemplate": "stale-branch-janitor", - "description": "Stale Branch Janitor — daily cleanup of stale branches and PRs" - }, - { - "id": "recovery-controller", - "name": "Recovery Controller", - "tier": "governance", - "cron": "0 */2 * * *", - "skills": ["start-governance-runtime", "recovery-controller"], - "promptTemplate": "recovery-controller", - 
"description": "Self-healing agent that detects unhealthy swarm conditions and executes remediation playbooks" - }, - { - "id": "risk-escalation-agent", - "name": "Risk Escalation Agent", - "tier": "governance", - "cron": "30 */4 * * *", - "skills": ["start-governance-runtime", "risk-escalation"], - "promptTemplate": "risk-escalation-agent", - "description": "Assesses cumulative swarm risk, gates dangerous operations, escalates to humans when needed" - }, - { - "id": "governance-monitor", - "name": "Governance Monitor", - "tier": "governance", - "cron": "0 2 * * *", - "skills": ["start-governance-runtime", "governance-log-audit", "policy-effectiveness-review"], - "promptTemplate": "governance-monitor", - "description": "Governance Agent — combined governance monitoring + policy effectiveness analysis (daily)" - }, - { - "id": "planning-agent", - "name": "Planning Agent", - "tier": "ops", - "cron": "0 6 * * *", - "skills": ["start-governance-runtime", "sprint-planning"], - "promptTemplate": "planning-agent", - "description": "Planning Agent — daily sprint planning with strategy document ingestion and swarm state updates" - }, - { - "id": "observability-agent", - "name": "Observability Agent", - "tier": "ops", - "cron": "0 9 * * *", - "skills": ["start-governance-runtime", "observability-review"], - "promptTemplate": "observability-agent", - "description": "Observability Agent — daily SRE analysis with swarm health monitoring and shared state updates" - }, - { - "id": "backlog-steward", - "name": "Backlog Steward", - "tier": "ops", - "cron": "0 5 * * *", - "skills": ["start-governance-runtime", "backlog-steward", "roadmap-expand"], - "promptTemplate": "backlog-steward", - "description": "Backlog Steward — daily ROADMAP expansion, capped at 3 issues/run" - }, - { - "id": "docs-sync-agent", - "name": "Documentation Maintainer", - "tier": "ops", - "cron": "0 11 * * *", - "skills": ["start-governance-runtime", "scheduled-docs-sync"], - "promptTemplate": 
"docs-sync-agent", - "description": "Documentation Maintainer — daily docs sync with strategy staleness detection" - }, - { - "id": "product-agent", - "name": "Product Agent", - "tier": "ops", - "cron": "0 7 * * *", - "skills": ["start-governance-runtime", "product-health-review"], - "promptTemplate": "product-agent", - "description": "Product Agent — daily product health and roadmap alignment review" - }, - { - "id": "progress-controller", - "name": "Progress Controller", - "tier": "ops", - "cron": "0 7 * * *", - "skills": ["start-governance-runtime", "progress-controller"], - "promptTemplate": "progress-controller", - "description": "Tracks roadmap phase completion, detects transitions, prevents backlog expansion" - }, - { - "id": "repo-hygiene-agent", - "name": "Repo Hygiene Agent", - "tier": "ops", - "cron": "0 3 * * *", - "skills": ["start-governance-runtime", "repo-hygiene"], - "promptTemplate": "repo-hygiene-agent", - "description": "Detect stale issues, solved issues, and undiscovered work" - }, - { - "id": "retrospective-agent", - "name": "Retrospective Agent", - "tier": "ops", - "cron": "0 8 * * 1", - "skills": ["start-governance-runtime", "retrospective"], - "promptTemplate": "retrospective-agent", - "description": "Analyzes failure patterns, extracts heuristics, publishes weekly retrospective with lessons learned" - }, - { - "id": "test-agent", - "name": "Test Agent", - "tier": "quality", - "cron": "0 8 * * *", - "skills": ["start-governance-runtime", "test-health-review"], - "promptTemplate": "test-agent", - "description": "Test Agent — daily test health review and coverage analysis" - }, - { - "id": "test-generation-agent", - "name": "Test Generation Agent", - "tier": "quality", - "cron": "0 11 * * 1", - "skills": ["start-governance-runtime", "generate-tests"], - "promptTemplate": "test-generation-agent", - "description": "Generate tests for untested source modules (weekly)" - }, - { - "id": "security-audit-agent", - "name": "Security Audit Agent", - 
"tier": "quality", - "cron": "0 20 * * 0", - "skills": ["start-governance-runtime", "dependency-security-audit", "security-code-scan"], - "promptTemplate": "security-audit-agent", - "description": "Security Agent — combined dependency audit + source code scan (weekly Sunday)" - }, - { - "id": "architect-agent", - "name": "Architect Agent", - "tier": "quality", - "cron": "0 10 * * *", - "skills": ["start-governance-runtime", "architecture-review"], - "promptTemplate": "architect-agent", - "description": "Review open PRs for architectural concerns" - }, - { - "id": "cicd-hardening-agent", - "name": "CI/CD Hardening Agent", - "tier": "quality", - "cron": "0 21 * * 0", - "skills": ["start-governance-runtime", "cicd-hardening-audit"], - "promptTemplate": "cicd-hardening-agent", - "description": "Weekly CI/CD hardening audit — action pinning, permissions, supply chain, governance integration" - }, - { - "id": "audit-merged-prs-agent", - "name": "Merged PR Auditor", - "tier": "quality", - "cron": "0 9 * * 1", - "skills": ["start-governance-runtime", "audit-merged-prs"], - "promptTemplate": "audit-merged-prs-agent", - "description": "Weekly audit of PRs merged in the last 7 days for overlooked risks" - }, - { - "id": "infrastructure-health-agent", - "name": "Infrastructure Health Agent", - "tier": "quality", - "cron": "0 21 * * *", - "skills": ["start-governance-runtime", "sdlc-pipeline-health"], - "promptTemplate": "infrastructure-health-agent", - "description": "Daily infrastructure and SDLC pipeline health check" - }, - { - "id": "marketing-content-agent", - "name": "Marketing Content Agent", - "tier": "marketing", - "cron": "0 9 * * 1", - "skills": ["start-governance-runtime", "marketing-content"], - "promptTemplate": "marketing-content-agent", - "description": "Weekly content generation — drafts LinkedIn posts, Twitter/X posts, and blog outlines" - } - ] -} diff --git a/packages/swarm/package.json b/packages/swarm/package.json deleted file mode 100644 index 
c8b0d43a..00000000 --- a/packages/swarm/package.json +++ /dev/null @@ -1,38 +0,0 @@ -{ - "name": "@red-codes/swarm", - "version": "2.9.3", - "description": "Agent swarm templates and scaffolder for AgentGuard", - "type": "module", - "main": "dist/index.js", - "types": "dist/index.d.ts", - "exports": { - ".": { - "import": "./dist/index.js", - "types": "./dist/index.d.ts" - } - }, - "files": [ - "dist", - "manifest.json", - "templates" - ], - "scripts": { - "build": "tsc -b", - "ts:check": "tsc --noEmit", - "test": "vitest run", - "lint": "eslint src/" - }, - "dependencies": { - "yaml": "^2.8.3" - }, - "devDependencies": { - "vitest": "^4.1.0", - "typescript": "^5.8.3" - }, - "keywords": [ - "agentguard", - "swarm", - "agent", - "scaffolder" - ] -} diff --git a/packages/swarm/src/config.ts b/packages/swarm/src/config.ts deleted file mode 100644 index ad3e784a..00000000 --- a/packages/swarm/src/config.ts +++ /dev/null @@ -1,61 +0,0 @@ -import { readFileSync, existsSync } from 'node:fs'; -import { join, dirname } from 'node:path'; -import { fileURLToPath } from 'node:url'; -import { parse as parseYaml } from 'yaml'; -import type { SwarmConfig, SwarmTier } from './types.js'; - -const __dirname = dirname(fileURLToPath(import.meta.url)); -const PACKAGE_ROOT = join(__dirname, '..'); -const DEFAULT_CONFIG_PATH = join( - PACKAGE_ROOT, - 'templates', - 'config', - 'agentguard-swarm.default.yaml' -); - -export function loadDefaultConfig(): SwarmConfig { - const raw = readFileSync(DEFAULT_CONFIG_PATH, 'utf8'); - return parseYaml(raw) as SwarmConfig; -} - -export function loadConfig(projectRoot: string): SwarmConfig { - const configPath = join(projectRoot, 'agentguard-swarm.yaml'); - const defaults = loadDefaultConfig(); - - if (!existsSync(configPath)) { - return defaults; - } - - const raw = readFileSync(configPath, 'utf8'); - const userConfig = parseYaml(raw) as Partial; - - return mergeConfig(defaults, userConfig); -} - -function mergeConfig(defaults: SwarmConfig, 
overrides: Partial): SwarmConfig { - const user = overrides.swarm ?? {}; - - return { - swarm: { - tiers: (user as Record).tiers - ? ((user as Record).tiers as SwarmTier[]) - : defaults.swarm.tiers, - schedules: { - ...defaults.swarm.schedules, - ...((user as Record).schedules as Record | undefined), - }, - paths: { - ...defaults.swarm.paths, - ...((user as Record).paths as Record | undefined), - }, - labels: { - ...defaults.swarm.labels, - ...((user as Record).labels as Record | undefined), - }, - thresholds: { - ...defaults.swarm.thresholds, - ...((user as Record).thresholds as Record | undefined), - }, - }, - }; -} diff --git a/packages/swarm/src/index.ts b/packages/swarm/src/index.ts deleted file mode 100644 index 2dda42af..00000000 --- a/packages/swarm/src/index.ts +++ /dev/null @@ -1,45 +0,0 @@ -export { scaffold, scaffoldSquad } from './scaffolder.js'; -export type { ScaffoldOptions } from './scaffolder.js'; -export { loadConfig, loadDefaultConfig } from './config.js'; -export { loadManifest, filterAgentsByTier, resolveSchedule, collectSkills } from './manifest.js'; -export { loadSquadManifest, buildAgentIdentity, parseAgentIdentity } from './squad-manifest.js'; -export { - readSquadState, - writeSquadState, - readEMReport, - writeEMReport, - readDirectorBrief, - writeDirectorBrief, -} from './squad-state.js'; -export { checkLoopGuards } from './loop-guards.js'; -export { - SWARM_MANIFEST_SCHEMA, - SQUAD_MANIFEST_SCHEMA, - SWARM_CONFIG_SCHEMA, - validateSwarmManifest, - validateSquadManifest, - validateSwarmConfig, -} from './schema.js'; -export type { ValidationError, ValidationResult } from './schema.js'; -export type { LoopGuardContext, GuardViolation, LoopGuardResult } from './loop-guards.js'; -export type { - SwarmAgent, - SwarmConfig, - SwarmManifest, - SwarmTier, - SwarmPaths, - SwarmLabels, - SwarmThresholds, - ScaffoldResult, - ScaffoldedAgent, - SquadRank, - AgentDriver, - AgentModel, - SquadAgent, - Squad, - SquadManifest, - LoopGuardConfig, - 
SquadState,
-  EMReport,
-  DirectorBrief,
-} from './types.js';
diff --git a/packages/swarm/src/loop-guards.ts b/packages/swarm/src/loop-guards.ts
deleted file mode 100644
index ce8cd61f..00000000
--- a/packages/swarm/src/loop-guards.ts
+++ /dev/null
@@ -1,64 +0,0 @@
-import type { LoopGuardConfig, SquadState } from './types.js';
-
-export interface LoopGuardContext {
-  retryCount: number;
-  predictedFileChanges: number;
-  runStartTime: number;
-}
-
-export type GuardViolation = 'budget' | 'retry' | 'blast-radius' | 'cascade' | 'time';
-
-export interface LoopGuardResult {
-  allowed: boolean;
-  violations: GuardViolation[];
-  messages: string[];
-}
-
-export function checkLoopGuards(
-  config: LoopGuardConfig,
-  state: SquadState,
-  context: LoopGuardContext
-): LoopGuardResult {
-  const violations: GuardViolation[] = [];
-  const messages: string[] = [];
-
-  // 1. Budget guard
-  if (state.prQueue.open >= config.maxOpenPRsPerSquad) {
-    violations.push('budget');
-    messages.push(
-      `PR budget exceeded: ${state.prQueue.open} open (max ${config.maxOpenPRsPerSquad}). Skip implementation, focus on review/merge.`
-    );
-  }
-
-  // 2. Retry guard
-  if (context.retryCount > config.maxRetries) {
-    violations.push('retry');
-    messages.push(
-      `Retry limit exceeded: ${context.retryCount} attempts (max ${config.maxRetries}). Create escalation issue.`
-    );
-  }
-
-  // 3. Blast radius guard
-  if (context.predictedFileChanges > config.maxBlastRadius) {
-    violations.push('blast-radius');
-    messages.push(
-      `Blast radius exceeded: ${context.predictedFileChanges} files (max ${config.maxBlastRadius}). Escalate to Architect.`
-    );
-  }
-
-  // 4. Time guard
-  const elapsedMs = Date.now() - context.runStartTime;
-  const elapsedMin = elapsedMs / 60_000;
-  if (elapsedMin > config.maxRunMinutes) {
-    violations.push('time');
-    messages.push(
-      `Run time exceeded: ${Math.round(elapsedMin)}min (max ${config.maxRunMinutes}min). 
Force-stop, EM investigates.`
-    );
-  }
-
-  return {
-    allowed: violations.length === 0,
-    violations,
-    messages,
-  };
-}
diff --git a/packages/swarm/src/manifest.ts b/packages/swarm/src/manifest.ts
deleted file mode 100644
index 52fda677..00000000
--- a/packages/swarm/src/manifest.ts
+++ /dev/null
@@ -1,34 +0,0 @@
-import { readFileSync } from 'node:fs';
-import { join, dirname } from 'node:path';
-import { fileURLToPath } from 'node:url';
-import type { SwarmManifest, SwarmAgent, SwarmConfig, SwarmTier } from './types.js';
-
-const __dirname = dirname(fileURLToPath(import.meta.url));
-const PACKAGE_ROOT = join(__dirname, '..');
-const MANIFEST_PATH = join(PACKAGE_ROOT, 'manifest.json');
-
-export function loadManifest(): SwarmManifest {
-  const raw = readFileSync(MANIFEST_PATH, 'utf8');
-  return JSON.parse(raw) as SwarmManifest;
-}
-
-export function filterAgentsByTier(
-  agents: readonly SwarmAgent[],
-  enabledTiers: readonly SwarmTier[]
-): SwarmAgent[] {
-  return agents.filter((a) => enabledTiers.includes(a.tier));
-}
-
-export function resolveSchedule(agent: SwarmAgent, config: SwarmConfig): string {
-  return config.swarm.schedules[agent.id] ?? 
agent.cron;
-}
-
-export function collectSkills(agents: readonly SwarmAgent[]): string[] {
-  const seen = new Set();
-  for (const agent of agents) {
-    for (const skill of agent.skills) {
-      seen.add(skill);
-    }
-  }
-  return [...seen].sort();
-}
diff --git a/packages/swarm/src/scaffolder.ts b/packages/swarm/src/scaffolder.ts
deleted file mode 100644
index 32d9144c..00000000
--- a/packages/swarm/src/scaffolder.ts
+++ /dev/null
@@ -1,188 +0,0 @@
-import {
-  readFileSync,
-  writeFileSync,
-  existsSync,
-  mkdirSync,
-  copyFileSync,
-  readdirSync,
-} from 'node:fs';
-import { join, dirname } from 'node:path';
-import { fileURLToPath } from 'node:url';
-import { loadConfig } from './config.js';
-import { loadManifest, filterAgentsByTier, resolveSchedule, collectSkills } from './manifest.js';
-import type { SwarmConfig, ScaffoldResult, ScaffoldedAgent, Squad, SquadState } from './types.js';
-
-const __dirname = dirname(fileURLToPath(import.meta.url));
-const PACKAGE_ROOT = join(__dirname, '..');
-const TEMPLATES_DIR = join(PACKAGE_ROOT, 'templates');
-const SKILLS_TEMPLATE_DIR = join(TEMPLATES_DIR, 'skills');
-const PROMPTS_TEMPLATE_DIR = join(TEMPLATES_DIR, 'prompts');
-const DEFAULT_CONFIG_TEMPLATE = join(TEMPLATES_DIR, 'config', 'agentguard-swarm.default.yaml');
-
-export interface ScaffoldOptions {
-  readonly projectRoot: string;
-  readonly force?: boolean;
-  readonly tiers?: readonly string[];
-}
-
-export function scaffold(options: ScaffoldOptions): ScaffoldResult {
-  const { projectRoot, force = false } = options;
-
-  // Write config file if it doesn't exist
-  const configPath = join(projectRoot, 'agentguard-swarm.yaml');
-  const configWritten = writeConfigIfMissing(configPath);
-
-  // Load config (user's or default)
-  const config = loadConfig(projectRoot);
-
-  // Apply tier override from CLI flags
-  const effectiveConfig: SwarmConfig = options.tiers
-    ? 
{ - swarm: { - ...config.swarm, - tiers: options.tiers as SwarmConfig['swarm']['tiers'], - }, - } - : config; - - // Load manifest and filter agents - const manifest = loadManifest(); - const enabledAgents = filterAgentsByTier(manifest.agents, effectiveConfig.swarm.tiers); - const requiredSkills = collectSkills(enabledAgents); - - // Scaffold skills - const skillsDir = join(projectRoot, '.claude', 'skills'); - mkdirSync(skillsDir, { recursive: true }); - - let skillsWritten = 0; - let skillsSkipped = 0; - - // Copy all available skill templates that are needed by enabled agents - const availableTemplates = readdirSync(SKILLS_TEMPLATE_DIR).filter((f) => f.endsWith('.md')); - - for (const templateFile of availableTemplates) { - const skillName = templateFile.replace('.md', ''); - if (!requiredSkills.includes(skillName)) { - continue; - } - - const targetPath = join(skillsDir, templateFile); - if (existsSync(targetPath) && !force) { - skillsSkipped++; - continue; - } - - const templateContent = readFileSync(join(SKILLS_TEMPLATE_DIR, templateFile), 'utf8'); - const rendered = renderTemplate(templateContent, effectiveConfig); - writeFileSync(targetPath, rendered, 'utf8'); - skillsWritten++; - } - - // Also copy skills that exist as templates but aren't in any agent's skill list - // (utility skills like full-test, release-prepare, etc.) 
- for (const templateFile of availableTemplates) { - const skillName = templateFile.replace('.md', ''); - if (requiredSkills.includes(skillName)) { - continue; // Already handled above - } - - const targetPath = join(skillsDir, templateFile); - if (existsSync(targetPath) && !force) { - skillsSkipped++; - continue; - } - - const templateContent = readFileSync(join(SKILLS_TEMPLATE_DIR, templateFile), 'utf8'); - const rendered = renderTemplate(templateContent, effectiveConfig); - writeFileSync(targetPath, rendered, 'utf8'); - skillsWritten++; - } - - // Build agent entries with resolved prompts - const agents: ScaffoldedAgent[] = enabledAgents.map((agent) => { - const promptFile = join(PROMPTS_TEMPLATE_DIR, `${agent.promptTemplate}.md`); - const promptContent = existsSync(promptFile) ? readFileSync(promptFile, 'utf8') : ''; - const renderedPrompt = renderTemplate(promptContent, effectiveConfig); - - return { - id: agent.id, - name: agent.name, - tier: agent.tier, - cron: resolveSchedule(agent, effectiveConfig), - description: agent.description, - prompt: renderedPrompt, - }; - }); - - return { - skillsWritten, - skillsSkipped, - promptsWritten: agents.length, - configWritten, - agents, - }; -} - -function writeConfigIfMissing(configPath: string): boolean { - if (existsSync(configPath)) { - return false; - } - copyFileSync(DEFAULT_CONFIG_TEMPLATE, configPath); - return true; -} - -export function scaffoldSquad(root: string, squadName: string, _squad: Squad): void { - const squadDir = join(root, '.agentguard', 'squads', squadName); - mkdirSync(squadDir, { recursive: true }); - - // Write initial squad state if not present - const statePath = join(squadDir, 'state.json'); - if (!existsSync(statePath)) { - const initialState: SquadState = { - squad: squadName, - sprint: { goal: '', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { open: 0, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - writeFileSync(statePath, 
JSON.stringify(initialState, null, 2), 'utf8'); - } - - // Write learnings store if not present - const learningsPath = join(squadDir, 'learnings.json'); - if (!existsSync(learningsPath)) { - writeFileSync(learningsPath, '[]', 'utf8'); - } -} - -function renderTemplate(content: string, config: SwarmConfig): string { - const { paths, labels } = config.swarm; - - const replacements: Record = { - 'paths.policy': paths.policy, - 'paths.roadmap': paths.roadmap, - 'paths.swarmState': paths.swarmState, - 'paths.logs': paths.logs, - 'paths.reports': paths.reports, - 'paths.swarmLogs': paths.swarmLogs, - 'paths.cli': paths.cli, - 'labels.pending': labels.pending, - 'labels.inProgress': labels.inProgress, - 'labels.review': labels.review, - 'labels.blocked': labels.blocked, - 'labels.critical': labels.critical, - 'labels.high': labels.high, - 'labels.medium': labels.medium, - 'labels.low': labels.low, - 'labels.developer': labels.developer, - 'labels.architect': labels.architect, - 'labels.auditor': labels.auditor, - }; - - let result = content; - for (const [key, value] of Object.entries(replacements)) { - result = result.replaceAll(`<%= ${key} %>`, value); - } - return result; -} diff --git a/packages/swarm/src/schema.ts b/packages/swarm/src/schema.ts deleted file mode 100644 index a31afad8..00000000 --- a/packages/swarm/src/schema.ts +++ /dev/null @@ -1,327 +0,0 @@ -// JSON Schema definitions for swarm template validation. -// Provides compile-time type alignment and runtime validation for -// swarm manifests, squad manifests, and swarm config files. - -import type { SwarmManifest, SquadManifest, SwarmConfig } from './types.js'; - -/** JSON Schema for a single SwarmAgent entry. 
*/ -const SWARM_AGENT_SCHEMA = { - type: 'object', - required: ['id', 'name', 'tier', 'cron', 'skills', 'promptTemplate', 'description'], - additionalProperties: false, - properties: { - id: { type: 'string', minLength: 1, pattern: '^[a-z0-9][a-z0-9-]*$' }, - name: { type: 'string', minLength: 1 }, - tier: { type: 'string', enum: ['core', 'governance', 'ops', 'quality', 'marketing'] }, - cron: { type: 'string', minLength: 1 }, - skills: { type: 'array', items: { type: 'string', minLength: 1 } }, - promptTemplate: { type: 'string', minLength: 1 }, - description: { type: 'string', minLength: 1 }, - }, -} as const; - -/** JSON Schema for the swarm manifest (manifest.json). */ -export const SWARM_MANIFEST_SCHEMA = { - $schema: 'https://json-schema.org/draft/2020-12/schema', - title: 'AgentGuard Swarm Manifest', - description: 'Defines the full set of agents available in an AgentGuard swarm.', - type: 'object', - required: ['version', 'agents'], - additionalProperties: false, - properties: { - version: { type: 'string', pattern: '^\\d+\\.\\d+\\.\\d+$' }, - agents: { type: 'array', items: SWARM_AGENT_SCHEMA, minItems: 1 }, - }, -} as const; - -/** JSON Schema for a single SquadAgent entry. */ -const SQUAD_AGENT_SCHEMA = { - type: 'object', - required: ['id', 'rank', 'driver', 'model', 'cron', 'skills'], - additionalProperties: false, - properties: { - id: { type: 'string', minLength: 1 }, - rank: { - type: 'string', - enum: ['director', 'em', 'product-lead', 'architect', 'senior', 'junior', 'qa'], - }, - driver: { type: 'string', enum: ['claude-code', 'copilot-cli'] }, - model: { type: 'string', enum: ['opus', 'sonnet', 'haiku', 'copilot'] }, - cron: { type: 'string', minLength: 1 }, - skills: { type: 'array', items: { type: 'string' } }, - }, -} as const; - -/** JSON Schema for a Squad entry. 
*/ -const SQUAD_SCHEMA = { - type: 'object', - required: ['name', 'repo', 'em', 'agents'], - additionalProperties: false, - properties: { - name: { type: 'string', minLength: 1 }, - repo: { type: 'string', minLength: 1 }, - em: SQUAD_AGENT_SCHEMA, - agents: { type: 'object', additionalProperties: SQUAD_AGENT_SCHEMA }, - }, -} as const; - -/** JSON Schema for the squad manifest (squad-manifest.yaml). */ -export const SQUAD_MANIFEST_SCHEMA = { - $schema: 'https://json-schema.org/draft/2020-12/schema', - title: 'AgentGuard Squad Manifest', - description: 'Defines squad hierarchy, agent roles, and loop guard configuration.', - type: 'object', - required: ['version', 'org', 'squads', 'loopGuards'], - additionalProperties: false, - properties: { - version: { type: 'string', pattern: '^\\d+\\.\\d+\\.\\d+$' }, - org: { - type: 'object', - required: ['director'], - additionalProperties: false, - properties: { director: SQUAD_AGENT_SCHEMA }, - }, - squads: { type: 'object', additionalProperties: SQUAD_SCHEMA }, - loopGuards: { - type: 'object', - required: ['maxOpenPRsPerSquad', 'maxRetries', 'maxBlastRadius', 'maxRunMinutes'], - additionalProperties: false, - properties: { - maxOpenPRsPerSquad: { type: 'number', minimum: 1 }, - maxRetries: { type: 'number', minimum: 0 }, - maxBlastRadius: { type: 'number', minimum: 1 }, - maxRunMinutes: { type: 'number', minimum: 1 }, - }, - }, - }, -} as const; - -/** JSON Schema for swarm config (agentguard-swarm.yaml). 
*/ -export const SWARM_CONFIG_SCHEMA = { - $schema: 'https://json-schema.org/draft/2020-12/schema', - title: 'AgentGuard Swarm Config', - description: 'User-customizable swarm configuration for schedules, paths, and thresholds.', - type: 'object', - required: ['swarm'], - additionalProperties: false, - properties: { - swarm: { - type: 'object', - required: ['tiers', 'schedules', 'paths', 'labels', 'thresholds'], - additionalProperties: false, - properties: { - tiers: { - type: 'array', - items: { - type: 'string', - enum: ['core', 'governance', 'ops', 'quality', 'marketing'], - }, - minItems: 1, - }, - schedules: { type: 'object', additionalProperties: { type: 'string' } }, - paths: { - type: 'object', - required: ['policy', 'roadmap', 'swarmState', 'logs', 'reports', 'swarmLogs', 'cli'], - additionalProperties: false, - properties: { - policy: { type: 'string' }, - roadmap: { type: 'string' }, - swarmState: { type: 'string' }, - logs: { type: 'string' }, - reports: { type: 'string' }, - swarmLogs: { type: 'string' }, - cli: { type: 'string' }, - }, - }, - labels: { - type: 'object', - additionalProperties: { type: 'string' }, - }, - thresholds: { - type: 'object', - required: ['maxOpenPRs', 'prStaleHours', 'blastRadiusHigh'], - additionalProperties: false, - properties: { - maxOpenPRs: { type: 'number', minimum: 1 }, - prStaleHours: { type: 'number', minimum: 1 }, - blastRadiusHigh: { type: 'number', minimum: 1 }, - }, - }, - }, - }, - }, -} as const; - -// --------------------------------------------------------------------------- -// Validation -// --------------------------------------------------------------------------- - -export interface ValidationError { - readonly path: string; - readonly message: string; -} - -export interface ValidationResult { - readonly valid: boolean; - readonly errors: readonly ValidationError[]; -} - -/** - * Lightweight JSON Schema validator. 
Covers the subset of JSON Schema used by - * our swarm schemas (type, required, enum, minLength, minimum, pattern, - * additionalProperties, minItems). Does NOT implement the full spec — use a - * library like Ajv for that. This is intentionally zero-dependency. - */ -function validateValue( - value: unknown, - schema: Record, - path: string -): ValidationError[] { - const errors: ValidationError[] = []; - - if (schema.type === 'object') { - if (typeof value !== 'object' || value === null || Array.isArray(value)) { - errors.push({ path, message: `Expected object, got ${typeof value}` }); - return errors; - } - - const obj = value as Record; - const required = (schema.required as string[] | undefined) ?? []; - for (const key of required) { - if (!(key in obj)) { - errors.push({ path: `${path}.${key}`, message: 'Required property missing' }); - } - } - - const properties = - (schema.properties as Record> | undefined) ?? {}; - for (const [key, propSchema] of Object.entries(properties)) { - if (key in obj) { - errors.push(...validateValue(obj[key], propSchema, `${path}.${key}`)); - } - } - - if (schema.additionalProperties === false) { - for (const key of Object.keys(obj)) { - if (!(key in properties)) { - errors.push({ path: `${path}.${key}`, message: 'Unexpected additional property' }); - } - } - } else if ( - typeof schema.additionalProperties === 'object' && - schema.additionalProperties !== null - ) { - const addSchema = schema.additionalProperties as Record; - for (const [key, val] of Object.entries(obj)) { - if (!(key in properties)) { - errors.push(...validateValue(val, addSchema, `${path}.${key}`)); - } - } - } - - return errors; - } - - if (schema.type === 'array') { - if (!Array.isArray(value)) { - errors.push({ path, message: `Expected array, got ${typeof value}` }); - return errors; - } - - const minItems = (schema.minItems as number | undefined) ?? 
0; - if (value.length < minItems) { - errors.push({ path, message: `Array must have at least ${minItems} items` }); - } - - if (schema.items && typeof schema.items === 'object') { - const itemSchema = schema.items as Record; - for (let i = 0; i < value.length; i++) { - errors.push(...validateValue(value[i], itemSchema, `${path}[${i}]`)); - } - } - - return errors; - } - - if (schema.type === 'string') { - if (typeof value !== 'string') { - errors.push({ path, message: `Expected string, got ${typeof value}` }); - return errors; - } - - const minLength = (schema.minLength as number | undefined) ?? 0; - if (value.length < minLength) { - errors.push({ path, message: `String must be at least ${minLength} characters` }); - } - - if (schema.pattern) { - const regex = new RegExp(schema.pattern as string); - if (!regex.test(value)) { - errors.push({ path, message: `String does not match pattern ${schema.pattern}` }); - } - } - - if (schema.enum && Array.isArray(schema.enum)) { - if (!schema.enum.includes(value)) { - errors.push({ - path, - message: `Value must be one of: ${(schema.enum as string[]).join(', ')}`, - }); - } - } - - return errors; - } - - if (schema.type === 'number') { - if (typeof value !== 'number') { - errors.push({ path, message: `Expected number, got ${typeof value}` }); - return errors; - } - - if (schema.minimum !== undefined && value < (schema.minimum as number)) { - errors.push({ path, message: `Value must be >= ${schema.minimum}` }); - } - } - - return errors; -} - -/** Validate a swarm manifest object against its schema. */ -export function validateSwarmManifest(manifest: unknown): ValidationResult { - const errors = validateValue( - manifest, - SWARM_MANIFEST_SCHEMA as unknown as Record, - '$' - ); - return { valid: errors.length === 0, errors }; -} - -/** Validate a squad manifest object against its schema. 
*/ -export function validateSquadManifest(manifest: unknown): ValidationResult { - const errors = validateValue( - manifest, - SQUAD_MANIFEST_SCHEMA as unknown as Record, - '$' - ); - return { valid: errors.length === 0, errors }; -} - -/** Validate a swarm config object against its schema. */ -export function validateSwarmConfig(config: unknown): ValidationResult { - const errors = validateValue( - config, - SWARM_CONFIG_SCHEMA as unknown as Record, - '$' - ); - return { valid: errors.length === 0, errors }; -} - -// Type-level assertions to ensure schemas stay aligned with TypeScript interfaces. -// These are compile-time only — no runtime cost. -type _AssertManifest = SwarmManifest; -type _AssertSquadManifest = SquadManifest; -type _AssertConfig = SwarmConfig; -void (0 as unknown as _AssertManifest); -void (0 as unknown as _AssertSquadManifest); -void (0 as unknown as _AssertConfig); diff --git a/packages/swarm/src/squad-manifest.ts b/packages/swarm/src/squad-manifest.ts deleted file mode 100644 index 12a0d65e..00000000 --- a/packages/swarm/src/squad-manifest.ts +++ /dev/null @@ -1,75 +0,0 @@ -import { parse } from 'yaml'; -import type { SquadManifest, Squad, SquadAgent, LoopGuardConfig } from './types.js'; - -export function loadSquadManifest(yamlContent: string): SquadManifest { - const raw = parse(yamlContent) as Record; - - const org = raw.org as Record; - const director = parseAgent(org.director as Record); - - const rawSquads = raw.squads as Record>; - const squads: Record = {}; - - for (const [name, rawSquad] of Object.entries(rawSquads)) { - const em = parseAgent(rawSquad.em as Record); - const rawAgents = rawSquad.agents as Record>; - const agents: Record = {}; - for (const [role, rawAgent] of Object.entries(rawAgents)) { - agents[role] = parseAgent(rawAgent); - } - squads[name] = { - name, - repo: rawSquad.repo as string, - em, - agents, - }; - } - - const rawGuards = raw.loopGuards as Record; - const loopGuards: LoopGuardConfig = { - 
maxOpenPRsPerSquad: rawGuards.maxOpenPRsPerSquad ?? 3, - maxRetries: rawGuards.maxRetries ?? 3, - maxBlastRadius: rawGuards.maxBlastRadius ?? 20, - maxRunMinutes: rawGuards.maxRunMinutes ?? 10, - }; - - return { - version: raw.version as string, - org: { director }, - squads, - loopGuards, - }; -} - -function parseAgent(raw: Record): SquadAgent { - return { - id: raw.id as string, - rank: raw.rank as SquadAgent['rank'], - driver: raw.driver as SquadAgent['driver'], - model: raw.model as SquadAgent['model'], - cron: raw.cron as string, - skills: (raw.skills as string[]) ?? [], - }; -} - -/** Build the 4-part identity string: driver:model:squad:rank */ -export function buildAgentIdentity(agent: SquadAgent, squadName: string): string { - return `${agent.driver}:${agent.model}:${squadName}:${agent.rank}`; -} - -/** Parse a 4-part identity string back into components */ -export function parseAgentIdentity(identity: string): { - driver: string; - model: string; - squad: string; - rank: string; -} | null { - const parts = identity.split(':'); - if (parts.length < 4) return null; - return { - driver: parts[0], - model: parts[1], - squad: parts[2], - rank: parts[3], - }; -} diff --git a/packages/swarm/src/squad-state.ts b/packages/swarm/src/squad-state.ts deleted file mode 100644 index 1e47de16..00000000 --- a/packages/swarm/src/squad-state.ts +++ /dev/null @@ -1,59 +0,0 @@ -import { readFileSync, writeFileSync, existsSync, mkdirSync } from 'node:fs'; -import { join } from 'node:path'; -import type { SquadState, EMReport, DirectorBrief } from './types.js'; - -function squadDir(root: string, squad: string): string { - return join(root, '.agentguard', 'squads', squad); -} - -function ensureDir(dir: string): void { - if (!existsSync(dir)) mkdirSync(dir, { recursive: true }); -} - -export function readSquadState(root: string, squad: string): SquadState | null { - const path = join(squadDir(root, squad), 'state.json'); - if (!existsSync(path)) return null; - try { - return 
JSON.parse(readFileSync(path, 'utf8')) as SquadState; - } catch { - return null; - } -} - -export function writeSquadState(root: string, squad: string, state: SquadState): void { - const dir = squadDir(root, squad); - ensureDir(dir); - writeFileSync(join(dir, 'state.json'), JSON.stringify(state, null, 2), 'utf8'); -} - -export function readEMReport(root: string, squad: string): EMReport | null { - const path = join(squadDir(root, squad), 'em-report.json'); - if (!existsSync(path)) return null; - try { - return JSON.parse(readFileSync(path, 'utf8')) as EMReport; - } catch { - return null; - } -} - -export function writeEMReport(root: string, squad: string, report: EMReport): void { - const dir = squadDir(root, squad); - ensureDir(dir); - writeFileSync(join(dir, 'em-report.json'), JSON.stringify(report, null, 2), 'utf8'); -} - -export function readDirectorBrief(root: string): DirectorBrief | null { - const path = join(root, '.agentguard', 'director-brief.json'); - if (!existsSync(path)) return null; - try { - return JSON.parse(readFileSync(path, 'utf8')) as DirectorBrief; - } catch { - return null; - } -} - -export function writeDirectorBrief(root: string, brief: DirectorBrief): void { - const dir = join(root, '.agentguard'); - ensureDir(dir); - writeFileSync(join(dir, 'director-brief.json'), JSON.stringify(brief, null, 2), 'utf8'); -} diff --git a/packages/swarm/src/types.ts b/packages/swarm/src/types.ts deleted file mode 100644 index 81dab3df..00000000 --- a/packages/swarm/src/types.ts +++ /dev/null @@ -1,166 +0,0 @@ -export interface SwarmAgent { - readonly id: string; - readonly name: string; - readonly tier: SwarmTier; - readonly cron: string; - readonly skills: readonly string[]; - readonly promptTemplate: string; - readonly description: string; -} - -export type SwarmTier = 'core' | 'governance' | 'ops' | 'quality' | 'marketing'; - -export interface SwarmManifest { - readonly version: string; - readonly agents: readonly SwarmAgent[]; -} - -export interface 
SwarmPaths { - readonly policy: string; - readonly roadmap: string; - readonly swarmState: string; - readonly logs: string; - readonly reports: string; - readonly swarmLogs: string; - readonly cli: string; -} - -export interface SwarmLabels { - readonly pending: string; - readonly inProgress: string; - readonly review: string; - readonly blocked: string; - readonly critical: string; - readonly high: string; - readonly medium: string; - readonly low: string; - readonly developer: string; - readonly architect: string; - readonly auditor: string; -} - -export interface SwarmThresholds { - readonly maxOpenPRs: number; - readonly prStaleHours: number; - readonly blastRadiusHigh: number; -} - -export interface SwarmConfig { - readonly swarm: { - readonly tiers: readonly SwarmTier[]; - readonly schedules: Readonly>; - readonly paths: SwarmPaths; - readonly labels: SwarmLabels; - readonly thresholds: SwarmThresholds; - }; -} - -export interface ScaffoldResult { - readonly skillsWritten: number; - readonly skillsSkipped: number; - readonly promptsWritten: number; - readonly configWritten: boolean; - readonly agents: readonly ScaffoldedAgent[]; -} - -export interface ScaffoldedAgent { - readonly id: string; - readonly name: string; - readonly tier: SwarmTier; - readonly cron: string; - readonly description: string; - readonly prompt: string; -} - -// --- Squad hierarchy types --- - -export type SquadRank = - | 'director' - | 'em' - | 'product-lead' - | 'architect' - | 'senior' - | 'junior' - | 'qa'; -export type AgentDriver = 'claude-code' | 'copilot-cli'; -export type AgentModel = 'opus' | 'sonnet' | 'haiku' | 'copilot'; - -export interface SquadAgent { - readonly id: string; - readonly rank: SquadRank; - readonly driver: AgentDriver; - readonly model: AgentModel; - readonly cron: string; - readonly skills: readonly string[]; -} - -export interface Squad { - readonly name: string; - readonly repo: string; // repo name or '*' for cross-repo - readonly em: SquadAgent; - 
readonly agents: Readonly>; -} - -export interface SquadManifest { - readonly version: string; - readonly org: { - readonly director: SquadAgent; - }; - readonly squads: Readonly>; - readonly loopGuards: LoopGuardConfig; -} - -export interface LoopGuardConfig { - readonly maxOpenPRsPerSquad: number; - readonly maxRetries: number; - readonly maxBlastRadius: number; - readonly maxRunMinutes: number; -} - -export interface SquadState { - readonly squad: string; - readonly sprint: { - readonly goal: string; - readonly issues: readonly string[]; - }; - readonly assignments: Readonly< - Record< - string, - { - readonly current: string | null; - readonly status: string; - readonly waiting?: string; - } - > - >; - readonly blockers: readonly string[]; - readonly prQueue: { - readonly open: number; - readonly reviewed: number; - readonly mergeable: number; - }; - readonly updatedAt: string; -} - -export interface EMReport { - readonly squad: string; - readonly timestamp: string; - readonly health: 'green' | 'yellow' | 'red'; - readonly summary: string; - readonly blockers: readonly string[]; - readonly escalations: readonly string[]; - readonly metrics: { - readonly prsOpened: number; - readonly prsMerged: number; - readonly issuesClosed: number; - readonly denials: number; - readonly retries: number; - }; -} - -export interface DirectorBrief { - readonly timestamp: string; - readonly squads: Readonly>; - readonly escalationsForHuman: readonly string[]; - readonly overallHealth: 'green' | 'yellow' | 'red'; -} diff --git a/packages/swarm/templates/config/agentguard-swarm.default.yaml b/packages/swarm/templates/config/agentguard-swarm.default.yaml deleted file mode 100644 index 554dd2af..00000000 --- a/packages/swarm/templates/config/agentguard-swarm.default.yaml +++ /dev/null @@ -1,51 +0,0 @@ -# AgentGuard Swarm Configuration -# Generated by: agentguard init swarm -# Customize values below to adapt the swarm to your project. - -swarm: - # Which agent tiers to install. 
- # core: coder, reviewer, merger, CI triage, branch janitor, merge conflict resolver, PR review responder - # governance: risk escalation, recovery controller, governance monitor, policy effectiveness - # ops: planning, observability, retrospective, docs sync, product health, progress controller, backlog steward, repo hygiene - # quality: test agent, security audit, architecture review, CI/CD hardening, test generation, audit merged PRs - # marketing: marketing content generator - tiers: - - core - - governance - - ops - - quality - - marketing - - # Override cron schedules per agent (local time, 5-field cron). - # Agents not listed here use the default from manifest.json. - schedules: {} - - # Project-specific paths. - paths: - policy: agentguard.yaml - roadmap: ROADMAP.md - swarmState: .agentguard/swarm-state.json - logs: logs/runtime-events.jsonl - reports: .agentguard/reports - swarmLogs: .agentguard/logs/swarm.log - cli: node apps/cli/dist/bin.js - - # GitHub label scheme used by the swarm for issue/PR management. - labels: - pending: 'status:pending' - inProgress: 'status:in-progress' - review: 'status:review' - blocked: 'status:blocked' - critical: 'priority:critical' - high: 'priority:high' - medium: 'priority:medium' - low: 'priority:low' - developer: 'role:developer' - architect: 'role:architect' - auditor: 'role:auditor' - - # Thresholds for swarm behavior. 
- thresholds: - maxOpenPRs: 5 - prStaleHours: 48 - blastRadiusHigh: 16 diff --git a/packages/swarm/templates/config/squad-manifest.default.yaml b/packages/swarm/templates/config/squad-manifest.default.yaml deleted file mode 100644 index f48377da..00000000 --- a/packages/swarm/templates/config/squad-manifest.default.yaml +++ /dev/null @@ -1,206 +0,0 @@ -version: "1.0.0" - -org: - director: - id: director - rank: director - driver: claude-code - model: opus - cron: "0 7,19 * * *" - skills: [squad-status, director-brief, escalation-router] - -squads: - kernel: - name: Kernel - repo: agent-guard - em: - id: kernel-em - rank: em - driver: claude-code - model: opus - cron: "0 */3 * * *" - skills: [squad-plan, squad-execute, squad-status, squad-retro, escalation-router] - agents: - product-lead: - id: kernel-pl - rank: product-lead - driver: claude-code - model: sonnet - cron: "0 6 * * *" - skills: [sprint-planning, roadmap-expand, backlog-steward, learn] - architect: - id: kernel-arch - rank: architect - driver: claude-code - model: opus - cron: "0 */4 * * *" - skills: [architecture-review, review-open-prs, eval, evolve] - senior: - id: kernel-sr - rank: senior - driver: copilot-cli - model: sonnet - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, create-pr, run-tests] - junior: - id: kernel-jr - rank: junior - driver: copilot-cli - model: copilot - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, run-tests, generate-tests] - qa: - id: kernel-qa - rank: qa - driver: copilot-cli - model: sonnet - cron: "0 */3 * * *" - skills: [e2e-testing, compliance-test, test-health-review, learn, prune] - - cloud: - name: Cloud - repo: agentguard-cloud - em: - id: cloud-em - rank: em - driver: claude-code - model: opus - cron: "0 */3 * * *" - skills: [squad-plan, squad-execute, squad-status, squad-retro, escalation-router] - agents: - product-lead: - id: cloud-pl - rank: product-lead - driver: claude-code - model: sonnet - cron: "0 6 * * *" - skills: 
[sprint-planning, roadmap-expand, backlog-steward, learn] - architect: - id: cloud-arch - rank: architect - driver: claude-code - model: opus - cron: "0 */4 * * *" - skills: [architecture-review, review-open-prs, eval, evolve] - senior: - id: cloud-sr - rank: senior - driver: copilot-cli - model: sonnet - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, create-pr, run-tests] - junior: - id: cloud-jr - rank: junior - driver: copilot-cli - model: copilot - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, run-tests, generate-tests] - qa: - id: cloud-qa - rank: qa - driver: copilot-cli - model: sonnet - cron: "0 */3 * * *" - skills: [e2e-testing, compliance-test, test-health-review, learn, prune] - - qa: - name: QA - repo: "*" - em: - id: qa-em - rank: em - driver: claude-code - model: sonnet - cron: "0 */3 * * *" - skills: [squad-plan, squad-execute, squad-status, squad-retro, escalation-router] - agents: - product-lead: - id: qa-pl - rank: product-lead - driver: claude-code - model: sonnet - cron: "0 6 * * *" - skills: [sprint-planning, test-strategy, stranger-test-plan, learn] - architect: - id: qa-arch - rank: architect - driver: claude-code - model: sonnet - cron: "0 */4 * * *" - skills: [test-architecture, compliance-review, eval, evolve] - senior: - id: qa-sr - rank: senior - driver: copilot-cli - model: sonnet - cron: "0 */2 * * *" - skills: [playwright-e2e, stranger-test-run, compliance-test, create-pr] - junior: - id: qa-jr - rank: junior - driver: copilot-cli - model: copilot - cron: "0 */2 * * *" - skills: [generate-tests, run-tests, test-data-generation] - qa: - id: qa-qa - rank: qa - driver: copilot-cli - model: haiku - cron: "0 */1 * * *" - skills: [e2e-testing, regression-analysis, flakiness-detection, learn, prune] - - - studio: - name: Studio - repo: agentguard-workspace - em: - id: studio-em - rank: em - driver: claude-code - model: opus - cron: "0 */3 * * *" - skills: [squad-plan, squad-execute, squad-status, squad-retro, 
escalation-router] - agents: - product-lead: - id: studio-pl - rank: product-lead - driver: claude-code - model: sonnet - cron: "0 6 * * *" - skills: [sprint-planning, roadmap-expand, backlog-steward, learn] - architect: - id: studio-arch - rank: architect - driver: claude-code - model: opus - cron: "0 */4 * * *" - skills: [architecture-review, review-open-prs, eval, evolve] - senior: - id: studio-sr - rank: senior - driver: copilot-cli - model: sonnet - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, create-pr, run-tests] - junior: - id: studio-jr - rank: junior - driver: copilot-cli - model: copilot - cron: "0 */2 * * *" - skills: [claim-issue, implement-issue, run-tests, generate-tests] - qa: - id: studio-qa - rank: qa - driver: copilot-cli - model: sonnet - cron: "0 */3 * * *" - skills: [e2e-testing, compliance-test, test-health-review, learn, prune] - -loopGuards: - maxOpenPRsPerSquad: 3 - maxRetries: 3 - maxBlastRadius: 20 - maxRunMinutes: 10 diff --git a/packages/swarm/templates/prompts/architect-agent.md b/packages/swarm/templates/prompts/architect-agent.md deleted file mode 100644 index 3f09a642..00000000 --- a/packages/swarm/templates/prompts/architect-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Architect Agent for this repository. You perform architecture reviews of open pull requests. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `architecture-review` — Review open PRs for architectural consistency, pattern adherence, and design quality - -If any skill reports STOP, end the run and report why. 
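The removed prompt templates all share one control-flow rule: run the listed skills in order, and end the run with a reason as soon as any skill reports STOP. A minimal TypeScript sketch of that rule follows. The `Skill`, `SkillOutcome`, and `RunReport` types and the `runSkillsInOrder` function are hypothetical names introduced for illustration; the templates only describe the behavior in prose and this is not code from the deleted package.

```typescript
// Hypothetical shapes: the prompt templates name skills but define no programmatic API.
type SkillOutcome = { status: 'ok' } | { status: 'stop'; reason: string };

interface Skill {
  name: string;
  run: () => SkillOutcome;
}

interface RunReport {
  completed: string[]; // skills that finished without reporting STOP
  stoppedBy: string | null; // name of the skill that reported STOP, if any
  reason: string | null; // why the run ended early, if it did
}

// Execute skills sequentially; the first STOP ends the run and is reported.
function runSkillsInOrder(skills: Skill[]): RunReport {
  const completed: string[] = [];
  for (const skill of skills) {
    const outcome = skill.run();
    if (outcome.status === 'stop') {
      return { completed, stoppedBy: skill.name, reason: outcome.reason };
    }
    completed.push(skill.name);
  }
  return { completed, stoppedBy: null, reason: null };
}

// Usage with the two skill names from the architect prompt; the outcomes are made up.
const report = runSkillsInOrder([
  { name: 'start-governance-runtime', run: () => ({ status: 'ok' }) },
  { name: 'architecture-review', run: () => ({ status: 'stop', reason: 'no open PRs' }) },
]);
console.log(report);
```

Returning a report object instead of throwing keeps "end the run and report why" as ordinary data, which is one possible design among several.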
diff --git a/packages/swarm/templates/prompts/audit-merged-prs-agent.md b/packages/swarm/templates/prompts/audit-merged-prs-agent.md deleted file mode 100644 index bd913763..00000000 --- a/packages/swarm/templates/prompts/audit-merged-prs-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Audit Merged PRs Agent for this repository. You audit recently merged pull requests for compliance and quality. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `audit-merged-prs` — Review recently merged PRs for governance compliance and quality issues - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/backlog-steward.md b/packages/swarm/templates/prompts/backlog-steward.md deleted file mode 100644 index 720c97f0..00000000 --- a/packages/swarm/templates/prompts/backlog-steward.md +++ /dev/null @@ -1,19 +0,0 @@ -You are the Backlog Steward for this repository. You manage the backlog and expand the roadmap with new issues. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `backlog-steward` — Groom and organize the issue backlog -3. `roadmap-expand` — Identify gaps and create new issues to expand the roadmap - -If any skill reports STOP, end the run and report why. 
diff --git a/packages/swarm/templates/prompts/ci-triage-agent.md b/packages/swarm/templates/prompts/ci-triage-agent.md deleted file mode 100644 index 08d30a4e..00000000 --- a/packages/swarm/templates/prompts/ci-triage-agent.md +++ /dev/null @@ -1,23 +0,0 @@ -You are the CI Triage Agent for this repository. You fix failing CI on open PR branches. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Pre-flight Check - -If there are no open PRs with failing CI runs, report "No failing CI — skipping this run" and STOP. - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `sync-main` — Sync local main branch with remote -3. `triage-failing-ci` — Diagnose and fix failing CI on open PR branches - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/cicd-hardening-agent.md b/packages/swarm/templates/prompts/cicd-hardening-agent.md deleted file mode 100644 index 06f4a788..00000000 --- a/packages/swarm/templates/prompts/cicd-hardening-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the CI/CD Hardening Agent for this repository. You audit CI/CD pipelines for security, reliability, and performance. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `cicd-hardening-audit` — Audit CI/CD workflows for hardening opportunities - -If any skill reports STOP, end the run and report why. 
diff --git a/packages/swarm/templates/prompts/code-review-agent.md b/packages/swarm/templates/prompts/code-review-agent.md deleted file mode 100644 index f48a68ce..00000000 --- a/packages/swarm/templates/prompts/code-review-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Code Review Agent for this repository. You review open pull requests for correctness, style, and safety. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `review-open-prs` — Review all open pull requests that need review - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/coder-agent.md b/packages/swarm/templates/prompts/coder-agent.md deleted file mode 100644 index 86f503f4..00000000 --- a/packages/swarm/templates/prompts/coder-agent.md +++ /dev/null @@ -1,33 +0,0 @@ -You are the Coder Agent for this repository. You pick up issues, implement them, and open pull requests. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Pre-flight Check - -Before starting, check the PR queue: - -```bash -cat .agentguard/swarm-state.json 2>/dev/null -``` - -If `prQueueHealthy` is `false` or `openAgentPRs >= 5`, report "PR queue full — skipping this run" and STOP. - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `sync-main` — Sync local main branch with remote -3. 
`discover-next-issue` — Find the next unassigned issue to work on -4. `claim-issue` — Claim the discovered issue so no other agent picks it up -5. `implement-issue` — Implement the solution on a feature branch -6. `run-tests` — Run the test suite and fix any failures -7. `create-pr` — Open a pull request for the implementation - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/docs-sync-agent.md b/packages/swarm/templates/prompts/docs-sync-agent.md deleted file mode 100644 index 4d55be64..00000000 --- a/packages/swarm/templates/prompts/docs-sync-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Docs Sync Agent for this repository. You keep documentation in sync with the codebase. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `scheduled-docs-sync` — Scan for documentation drift and update docs to match current code - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/governance-monitor.md b/packages/swarm/templates/prompts/governance-monitor.md deleted file mode 100644 index 8bc3e470..00000000 --- a/packages/swarm/templates/prompts/governance-monitor.md +++ /dev/null @@ -1,19 +0,0 @@ -You are the Governance Monitor for this repository. You audit governance logs and review policy effectiveness. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. 
- -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `governance-log-audit` — Audit recent governance logs for anomalies or violations -3. `policy-effectiveness-review` — Review policy effectiveness and suggest improvements - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/infrastructure-health-agent.md b/packages/swarm/templates/prompts/infrastructure-health-agent.md deleted file mode 100644 index b02f9708..00000000 --- a/packages/swarm/templates/prompts/infrastructure-health-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Infrastructure Health Agent for this repository. You monitor SDLC pipeline health and reliability. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `sdlc-pipeline-health` — Assess SDLC pipeline health, build times, and reliability metrics - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/marketing-content-agent.md b/packages/swarm/templates/prompts/marketing-content-agent.md deleted file mode 100644 index 02f98e89..00000000 --- a/packages/swarm/templates/prompts/marketing-content-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Marketing Content Agent for this repository. You generate marketing content based on recent project activity. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. 
- -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `marketing-content` — Generate marketing content from recent project milestones and features - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/merge-conflict-resolver.md b/packages/swarm/templates/prompts/merge-conflict-resolver.md deleted file mode 100644 index f9698089..00000000 --- a/packages/swarm/templates/prompts/merge-conflict-resolver.md +++ /dev/null @@ -1,22 +0,0 @@ -You are the Merge Conflict Resolver for this repository. You rebase PRs that have merge conflicts against main. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Constraints - -Process at most 1 PR per run. If multiple PRs have conflicts, pick the oldest one. - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `resolve-merge-conflicts` — Find a PR with merge conflicts, rebase it, and push - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/observability-agent.md b/packages/swarm/templates/prompts/observability-agent.md deleted file mode 100644 index 35a7bd34..00000000 --- a/packages/swarm/templates/prompts/observability-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Observability Agent for this repository. You perform SRE-style analysis of system health and reliability. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. 
- -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `observability-review` — Analyze runtime telemetry, error rates, and system health - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/planning-agent.md b/packages/swarm/templates/prompts/planning-agent.md deleted file mode 100644 index ecc42e60..00000000 --- a/packages/swarm/templates/prompts/planning-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Planning Agent for this repository. You perform sprint planning by analyzing the backlog and prioritizing work. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `sprint-planning` — Analyze the backlog, prioritize issues, and plan the next sprint - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/pr-merger-agent.md b/packages/swarm/templates/prompts/pr-merger-agent.md deleted file mode 100644 index ab64951c..00000000 --- a/packages/swarm/templates/prompts/pr-merger-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the PR Merger Agent for this repository. You auto-merge approved pull requests that have passing CI. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. 
- -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `pr-merger` — Find and merge approved PRs with passing CI checks - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/pr-review-responder.md b/packages/swarm/templates/prompts/pr-review-responder.md deleted file mode 100644 index c6e7af19..00000000 --- a/packages/swarm/templates/prompts/pr-review-responder.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the PR Review Responder for this repository. You respond to review comments on agent-authored pull requests. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `respond-to-pr-reviews` — Address review comments on agent PRs with code changes or replies - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/product-agent.md b/packages/swarm/templates/prompts/product-agent.md deleted file mode 100644 index fb196283..00000000 --- a/packages/swarm/templates/prompts/product-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Product Agent for this repository. You perform product health reviews to assess feature completeness and quality. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. 
- -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `product-health-review` — Assess product health, feature completeness, and quality metrics - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/progress-controller.md b/packages/swarm/templates/prompts/progress-controller.md deleted file mode 100644 index 7c502d7a..00000000 --- a/packages/swarm/templates/prompts/progress-controller.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Progress Controller for this repository. You track roadmap phase progress and update milestones. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `progress-controller` — Track roadmap phase progress, update milestones, and flag blockers - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/recovery-controller.md b/packages/swarm/templates/prompts/recovery-controller.md deleted file mode 100644 index 710daf09..00000000 --- a/packages/swarm/templates/prompts/recovery-controller.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Recovery Controller for this repository. You perform self-healing checks on swarm health and recover from failures. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. 
- -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `recovery-controller` — Assess swarm health and recover from stuck or failed states - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/repo-hygiene-agent.md b/packages/swarm/templates/prompts/repo-hygiene-agent.md deleted file mode 100644 index 855b3e0c..00000000 --- a/packages/swarm/templates/prompts/repo-hygiene-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Repo Hygiene Agent for this repository. You manage stale issues and close solved issues. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `repo-hygiene` — Identify stale issues, close solved issues, and clean up the issue tracker - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/retrospective-agent.md b/packages/swarm/templates/prompts/retrospective-agent.md deleted file mode 100644 index 748b7e3a..00000000 --- a/packages/swarm/templates/prompts/retrospective-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Retrospective Agent for this repository. You run weekly retrospectives on swarm performance. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. 
- -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `retrospective` — Analyze the past week of swarm activity and produce a retrospective report - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/risk-escalation-agent.md b/packages/swarm/templates/prompts/risk-escalation-agent.md deleted file mode 100644 index 920d8f8e..00000000 --- a/packages/swarm/templates/prompts/risk-escalation-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Risk Escalation Agent for this repository. You assess cumulative risk across the swarm and escalate when thresholds are exceeded. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `risk-escalation` — Assess cumulative risk from recent governance sessions and escalate if needed - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/security-audit-agent.md b/packages/swarm/templates/prompts/security-audit-agent.md deleted file mode 100644 index 1c2ecfa4..00000000 --- a/packages/swarm/templates/prompts/security-audit-agent.md +++ /dev/null @@ -1,19 +0,0 @@ -You are the Security Audit Agent for this repository. You scan dependencies and code for security vulnerabilities. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. 
- -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `dependency-security-audit` — Audit dependencies for known vulnerabilities -3. `security-code-scan` — Scan source code for security anti-patterns and vulnerabilities - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/stale-branch-janitor.md b/packages/swarm/templates/prompts/stale-branch-janitor.md deleted file mode 100644 index 7439f8aa..00000000 --- a/packages/swarm/templates/prompts/stale-branch-janitor.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Stale Branch Janitor for this repository. You clean up stale branches and abandoned PRs. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `stale-branch-janitor` — Identify and clean up stale branches and abandoned PRs - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/test-agent.md b/packages/swarm/templates/prompts/test-agent.md deleted file mode 100644 index 313f548b..00000000 --- a/packages/swarm/templates/prompts/test-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Test Agent for this repository. You review test health and coverage. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. 
- -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `test-health-review` — Analyze test coverage, flaky tests, and overall test suite health - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/prompts/test-generation-agent.md b/packages/swarm/templates/prompts/test-generation-agent.md deleted file mode 100644 index 2e28680c..00000000 --- a/packages/swarm/templates/prompts/test-generation-agent.md +++ /dev/null @@ -1,18 +0,0 @@ -You are the Test Generation Agent for this repository. You generate tests for untested or under-tested modules. - -## Autonomy Directive - -This is an unattended scheduled task. No human is present. - -- NEVER pause to ask for clarification — make your best judgment and proceed -- NEVER use AskUserQuestion or any interactive prompt -- Default to the safest option in every ambiguous situation - -## Task - -Execute these skills in order: - -1. `start-governance-runtime` — Start the governance kernel -2. `generate-tests` — Identify untested modules and generate test files for them - -If any skill reports STOP, end the run and report why. diff --git a/packages/swarm/templates/skills/architecture-review.md b/packages/swarm/templates/skills/architecture-review.md deleted file mode 100644 index c59bb60b..00000000 --- a/packages/swarm/templates/skills/architecture-review.md +++ /dev/null @@ -1,158 +0,0 @@ -# Skill: Architecture Review - -Review open PRs for architectural concerns: module boundary violations, dependency direction, cross-layer coupling, and consistency with the unified architecture. Complements the `review-open-prs` skill with deeper structural analysis. Designed for periodic scheduled execution. 
- -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active. If governance cannot be activated, STOP. - -### 2. List Open PRs - -```bash -gh pr list --state open --json number,title,headRefName,additions,deletions --limit 10 -``` - -If no open PRs exist, report "No open PRs to review" and STOP. - -### 3. Filter PRs Needing Architecture Review - -Select PRs that touch structural files (skip docs-only, config-only, or test-only PRs): - -```bash -gh pr view --json files --jq '.files[].path' -``` - -A PR needs architecture review if it modifies files in: -- `packages/kernel/src/` — core governance engine -- `packages/events/src/` — canonical event model -- `packages/policy/src/` — policy system -- `packages/invariants/src/` — invariant system -- `packages/adapters/src/` — execution adapters -- `packages/core/src/` — shared types and utilities -- `apps/cli/src/` — CLI entry points and commands - -Skip PRs that already have an `**Architect Agent**` comment. Select up to **2 PRs** per run. - -### 4. Review Each PR - -For each selected PR: - -#### 4a. Read the Diff - -```bash -gh pr diff -``` - -#### 4b. 
Analyze Module Boundaries - -The architecture defines 7 distinct workspace packages with strict dependency rules: - -``` -@red-codes/core ← (shared types, no imports from other packages) - ↑ -@red-codes/events ← (may import from core) - ↑ -@red-codes/policy ← (may import from core) - ↑ -@red-codes/invariants ← (may import from core, events) - ↑ -@red-codes/kernel ← (may import from core, events, policy, invariants) - ↑ -@red-codes/adapters ← (may import from core, events, kernel) - ↑ -apps/cli ← (may import from anything) -``` - -Check the diff for import statements that violate these dependency rules: -- `@red-codes/kernel` must NOT import from `@red-codes/adapters` or `apps/cli` -- `@red-codes/adapters` must NOT import from `apps/cli` -- `@red-codes/events` must NOT import from `kernel`, `policy`, `invariants`, `adapters`, or `cli` -- `@red-codes/policy` must NOT import from `kernel`, `invariants`, `adapters`, or `cli` -- `@red-codes/core` must NOT import from any other workspace package - -#### 4c. Check Event Model Consistency - -If the PR adds new event kinds: -- New events must be defined in `packages/events/src/schema.ts` -- New events must follow the existing naming convention (PascalCase) -- New events must have a factory function for creation -- New events must be documented in the appropriate event category - -#### 4d. Check Action Type Consistency - -If the PR adds new action types: -- New actions must be registered in `packages/core/src/actions.ts` -- New actions must follow the `class.verb` naming convention (e.g., `file.read`, `git.push`) -- New action classes must have a corresponding adapter in `packages/adapters/src/` - -#### 4e. Check Public API Surface - -If the PR modifies exports from barrel files (`index.ts`): -- Removing exports is a breaking change — flag it -- Adding exports should be intentional, not accidental - -#### 4f. 
Assess Coupling - -Analyze the changed files for coupling concerns: -- Does the change introduce circular dependencies? -- Does the change add imports from multiple unrelated layers? -- Does the change leak implementation details across layer boundaries? -- Could the change be implemented with fewer cross-layer imports? - -### 5. Post Architecture Review - -For each reviewed PR, post a structured comment: - -```bash -gh pr comment --body "**Architect Agent** — architecture review - -## Module Boundary Analysis - -| Layer | Files Changed | Boundary Status | -|-------|--------------|----------------| -| kernel/ | N | CLEAN/VIOLATION | -| events/ | N | CLEAN/VIOLATION | -| policy/ | N | CLEAN/VIOLATION | -| invariants/ | N | CLEAN/VIOLATION | -| adapters/ | N | CLEAN/VIOLATION | -| cli/ | N | CLEAN/VIOLATION | -| core/ | N | CLEAN/VIOLATION | - -## Findings - -| # | Severity | Category | Details | -|---|----------|----------|---------| -| 1 | | | | - -## Recommendations - - - ---- -*Architecture review by Architect Agent on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" -``` - -### 6. Summary - -Report: -- **PRs reviewed**: N (list PR numbers) -- **Boundary violations found**: N -- **Coupling concerns**: N -- **API surface changes**: N -- If all clean: "Architecture review passed — no structural concerns" - -## Rules - -- Review a maximum of **2 PRs per run** (architecture review is deeper than code review). -- **Never approve or merge PRs** — post informational comments only. -- **Never modify PR code** — review is read-only. -- Skip PRs that already have an `**Architect Agent**` comment. -- Skip docs-only, config-only, and test-only PRs — they don't affect architecture. -- Focus on structural concerns, not coding style (that is the Reviewer Agent's job). -- If `gh` CLI is not authenticated, report the error and STOP. 
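The dependency rules in step 4b of the architecture-review skill above can be sketched as a small import check over a unified diff. This is a minimal illustration, not part of the removed package: the `ALLOWED_IMPORTS` table is transcribed from the rules listed in step 4b, while `ownerOf`, `findBoundaryViolations`, and the `Violation` shape are hypothetical names introduced here. It assumes workspace imports are written as `from '@red-codes/<package>'` and only inspects added lines.

```typescript
// Hypothetical sketch: names below are illustrative, not an existing API.
type PackageName =
  | "core" | "events" | "policy" | "invariants" | "kernel" | "adapters" | "cli";

// For each package, the workspace packages it may import from (per step 4b).
const ALLOWED_IMPORTS: Record<PackageName, readonly PackageName[]> = {
  core: [],
  events: ["core"],
  policy: ["core"],
  invariants: ["core", "events"],
  kernel: ["core", "events", "policy", "invariants"],
  adapters: ["core", "events", "kernel"],
  cli: ["core", "events", "policy", "invariants", "kernel", "adapters"],
};

interface Violation {
  file: string;
  from: PackageName;
  imported: PackageName;
}

function isPackageName(name: string): name is PackageName {
  return Object.prototype.hasOwnProperty.call(ALLOWED_IMPORTS, name);
}

// Map a changed file path to the package that owns it, if it is one we track.
function ownerOf(path: string): PackageName | undefined {
  if (path.startsWith("apps/cli/")) return "cli";
  const name = /^packages\/([^/]+)\//.exec(path)?.[1];
  return name !== undefined && isPackageName(name) ? name : undefined;
}

// Scan added lines of a unified diff for imports that break the table above.
function findBoundaryViolations(diff: string): Violation[] {
  const violations: Violation[] = [];
  let owner: PackageName | undefined;
  let file = "";
  for (const line of diff.split("\n")) {
    if (line.startsWith("+++ ")) {
      // New file header: remember which package the following hunks belong to.
      file = line.slice(4).replace(/^b\//, "");
      owner = ownerOf(file);
      continue;
    }
    if (owner === undefined || !line.startsWith("+")) continue;
    const imported = /from\s+["']@red-codes\/([a-z-]+)["']/.exec(line)?.[1];
    if (imported === undefined || !isPackageName(imported)) continue;
    if (imported !== owner && !ALLOWED_IMPORTS[owner].includes(imported)) {
      violations.push({ file, from: owner, imported });
    }
  }
  return violations;
}
```

Feeding it the output of the `gh pr diff` call from step 4a would yield one entry per offending import, which maps onto the CLEAN/VIOLATION column of the review comment. Imports of the CLI itself are not detected by this sketch, since the pattern only matches the package names in the table.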
diff --git a/packages/swarm/templates/skills/audit-merged-prs.md b/packages/swarm/templates/skills/audit-merged-prs.md deleted file mode 100644 index 6a225871..00000000 --- a/packages/swarm/templates/skills/audit-merged-prs.md +++ /dev/null @@ -1,253 +0,0 @@ -# Skill: Audit Merged PRs - -Audit pull requests merged in the last 7 days for risks that may have been overlooked — unresolved review comments, dismissed change requests, bypassed CI, or governance violations. Creates a consolidated risk report as a GitHub issue. Designed for weekly scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If risk classification is ambiguous, round **up** to the higher risk level (err on the side of caution) -- If governance activation fails, log the failure and **STOP** — do not ask what to do -- If `gh` CLI fails, log the error and **STOP** — do not ask for credentials -- If a PR's data is incomplete or malformed, **skip that PR** and note it in the summary -- Default to the **safest option** in every ambiguous situation (flag risk > ignore risk) -- When in doubt about any decision, choose the conservative path and document why in the summary - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. Requires `gh` CLI authenticated with repo access. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. 
Ensure Labels Exist - -```bash -gh label create "audit" --color "FBCA04" --description "Post-merge risk audit finding" 2>/dev/null || true -gh label create "source:merged-pr-audit" --color "C5DEF5" --description "Auto-created by Merged PR Audit skill" 2>/dev/null || true -gh label create "<%= labels.high %>" --color "D93F0B" --description "High priority" 2>/dev/null || true -gh label create "<%= labels.medium %>" --color "FBCA04" --description "Medium priority" 2>/dev/null || true -``` - -### 3. List Recently Merged PRs - -```bash -gh pr list --state merged --json number,title,mergedAt,author,mergedBy,headRefName,additions,deletions,labels --limit 50 -``` - -Filter results: -- **Include**: PRs where `mergedAt` is within the last 7 days -- **Exclude**: PRs already audited (check for a comment by `**AgentGuard Merged PR Audit Bot**`) - -If no recently merged PRs exist, report "No recently merged PRs to audit" and STOP. - -### 4. Audit Each Merged PR - -For each merged PR, collect evidence across five risk dimensions: - -#### 4a. Review Comment Analysis - -Check for unresolved review threads at time of merge: - -```bash -gh pr view --json reviewThreads --jq '[.reviewThreads[] | select(.isResolved == false)] | length' -``` - -Check for change-request reviews that were never addressed: - -```bash -gh pr view --json reviews --jq '[.reviews[] | select(.state == "CHANGES_REQUESTED")] | length' -``` - -Check if change-request reviews were followed by approvals or remained outstanding: - -```bash -gh pr view --json reviews --jq '[.reviews[] | {author: .author.login, state: .state, submittedAt: .submittedAt}]' -``` - -**Risk signals**: -- Unresolved review threads at merge → **HIGH** -- Outstanding CHANGES_REQUESTED with no subsequent approval by the same reviewer → **HIGH** -- Security-related keywords in dismissed/unresolved comments (e.g., "vulnerability", "injection", "auth", "secret", "credential", "XSS", "CSRF", "sanitize") → **CRITICAL** - -#### 4b. 
CI Status at Merge - -```bash -gh pr view --json statusCheckRollup --jq '[.statusCheckRollup[] | select(.status != "COMPLETED" or .conclusion != "SUCCESS")]' -``` - -**Risk signals**: -- Any required check not passing at merge time → **CRITICAL** -- CI checks skipped entirely (no check runs at all) → **HIGH** -- Only optional/non-required checks failing → **LOW** - -#### 4c. Governance Report Analysis - -Read the PR body and look for the `## Governance Report` section: - -```bash -gh pr view --json body --jq '.body' -``` - -Parse the governance report for: -- `PolicyDenied` count > 0 → **MEDIUM** -- `InvariantViolation` count > 0 → **HIGH** -- `ActionDenied` count > 0 → **MEDIUM** -- No governance report at all (for agent-authored PRs) → **MEDIUM** - -#### 4d. Size and Scope Assessment - -```bash -gh pr view --json additions,deletions,files --jq '{additions: .additions, deletions: .deletions, fileCount: (.files | length)}' -``` - -**Risk signals**: -- PR > 500 lines changed with no evidence of review → **MEDIUM** -- PR > 1000 lines changed → **MEDIUM** (flag for scope assessment regardless) -- PR touches > 10 files → **LOW** (informational) - -#### 4e. Protected File Changes - -```bash -gh pr view --json files --jq '[.files[].path]' -``` - -Check if any changed files are in protected paths: -- `packages/kernel/src/**` — core governance kernel -- `packages/policy/src/**` — policy evaluation engine -- `packages/invariants/src/**` — invariant system -- `<%= paths.policy %>` — default policy -- `.claude/settings.json` — hook configuration - -**Risk signals**: -- Protected kernel/policy/invariant files modified → **HIGH** -- `<%= paths.policy %>` or `.claude/settings.json` modified → **CRITICAL** - -### 5. Score and Classify Each PR - -Assign risk scores: -- **CRITICAL** = 4 points -- **HIGH** = 3 points -- **MEDIUM** = 2 points -- **LOW** = 1 point - -For each PR, sum all risk scores. 
Classify the PR: - -| Total Score | Overall Risk | -|-------------|-------------| -| 0 | CLEAN | -| 1-2 | LOW | -| 3-4 | MEDIUM | -| 5-7 | HIGH | -| 8+ | CRITICAL | - -### 6. Generate Risk Report - -If any PR has risk level **MEDIUM or above**, generate a consolidated report. - -If no risks at MEDIUM or above, report "All merged PRs pass audit — no risks detected" and STOP. - -#### 6a. Check for Existing Open Audit Issue - -```bash -gh issue list --state open --label "source:merged-pr-audit" --json number,title --limit 1 -``` - -#### 6b. Compile the Report - -``` -## Merged PR Risk Audit Report - -**Audit period**: to -**PRs audited**: -**PRs with risks**: - -### Risk Summary - -| PR | Title | Merged By | Risk Level | Score | Top Finding | -|----|-------|-----------|------------|-------|-------------| -| # | | @<user> | <CRITICAL/HIGH/MEDIUM> | <score> | <top finding> | - -### Detailed Findings - -#### PR #<N>: <title> - -**Risk level**: <LEVEL> (score: <N>) -**Merged by**: @<user> on <date> -**Changes**: +<additions> -<deletions> across <file_count> files - -| Risk | Level | Details | -|------|-------|---------| -| <finding> | <CRITICAL/HIGH/MEDIUM/LOW> | <details> | - -<Repeat for each PR with risk >= MEDIUM> - -### Patterns Observed - -<Cross-PR patterns — e.g., "3 PRs merged with outstanding change requests", "CI was bypassed on 2 PRs"> - -### Recommendations - -<Actionable recommendations based on findings — e.g., "Enable branch protection to require CI passage", "Require review re-approval after new commits"> - ---- -*Automated audit by audit-merged-prs skill on <timestamp>* -``` - -#### 6c. 
Create or Update Issue - -If an existing open audit issue exists, comment on it: - -```bash -gh issue comment <ISSUE_NUMBER> --body "<audit report>" -``` - -If no existing issue, create one: - -```bash -gh issue create \ - --title "audit: Merged PR Risk Report — <start_date> to <end_date>" \ - --body "<full audit report>" \ - --label "audit" --label "source:merged-pr-audit" --label "priority:<highest risk level found>" -``` - -#### 6d. Mark Audited PRs - -For each audited PR (regardless of risk level), post a brief audit receipt comment: - -```bash -gh pr comment <PR_NUMBER> --body "**AgentGuard Merged PR Audit Bot** — audited - -Risk level: **<LEVEL>** (score: <N>) -Findings: <count> (<brief summary>) -Full report: #<ISSUE_NUMBER> - ---- -*Automated audit by audit-merged-prs skill on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" -``` - -### 7. Summary - -Report: -- **PRs audited**: N (list PR numbers) -- **Risk breakdown**: N CRITICAL, N HIGH, N MEDIUM, N LOW, N CLEAN -- **Issue created/updated**: #<ISSUE_NUMBER> (or "none — all PRs clean") -- **Top risks**: <brief list of most concerning findings> -- If clean: "All merged PRs pass audit — governance is healthy" - -## Rules - -- **Read-only** — never modify merged PRs, never revert merges, never reopen closed PRs -- **Never close existing audit issues** — only create new ones or comment on existing open ones -- **Only create an issue if risks at MEDIUM or above are found** — do not create noise for LOW/CLEAN results -- **Cap audit at 50 merged PRs per run** — if more exist, audit the 50 most recent and note the overflow -- **Skip PRs already audited** — check for `**AgentGuard Merged PR Audit Bot**` comment -- **Do not name or blame individuals** — focus on process gaps and systemic patterns, not personal attribution -- Be factual and objective in findings — avoid inflammatory language -- If `gh` CLI is not authenticated, report the error and STOP -- If no merged PRs exist in the audit window, report cleanly and STOP diff 
--git a/packages/swarm/templates/skills/backlog-steward.md b/packages/swarm/templates/skills/backlog-steward.md deleted file mode 100644 index e663fbc1..00000000 --- a/packages/swarm/templates/skills/backlog-steward.md +++ /dev/null @@ -1,123 +0,0 @@ -# Skill: Backlog Steward - -Expand ROADMAP items into GitHub issues. Cross-reference against open issues to avoid duplicates. Designed for daily scheduled execution. - -**Scope**: ROADMAP expansion ONLY. Code annotation scanning (TODO/FIXME/HACK) is handled by the Repo Hygiene Agent — do NOT scan annotations here to avoid duplicate issue creation. - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 1b. Check System Mode - -```bash -cat <%= paths.swarmState %> 2>/dev/null | grep -o '"mode":"[^"]*"' 2>/dev/null -``` - -- If mode is `safe`: output "System in SAFE MODE — skipping backlog expansion" and **STOP immediately** -- If mode is `conservative`: reduce cap to **1 issue per run** instead of 3 - -### 2. Scan ROADMAP Unchecked Items - -Read `<%= paths.roadmap %>` and extract all unchecked items: - -```bash -grep -n "\- \[ \]" <%= paths.roadmap %> -``` - -For each match, extract the item description and its parent section (Phase name). - -### 3. Fetch Issues for Deduplication - -Retrieve ALL open issues (any source agent) as the deduplication reference: - -```bash -gh issue list --state open --limit 200 --json number,title,body,labels -``` - -Also retrieve recently closed issues (last 30 days) to avoid re-filing resolved work: - -```bash -gh issue list --state closed --limit 100 --json number,title,labels,closedAt -``` - -Filter closed issues to only those closed within the last 30 days. - -### 4. 
Deduplicate (Strict Multi-Signal Matching) - -For each ROADMAP item, check whether an existing issue (open OR recently closed) already covers it. Use ALL of the following matching signals — a match on ANY signal means SKIP: - -**Signal 1 — Substring match**: The ROADMAP checkbox text appears as a substring in any issue title (case-insensitive). - -**Signal 2 — Keyword overlap**: Extract the 3-5 most distinctive keywords from the ROADMAP item (nouns and verbs, excluding common words like "add", "implement", "support", "the", "for", "with"). Extract the same from each issue title. If ≥60% of the ROADMAP item's keywords appear in an issue title, it is a match. - -**Signal 3 — Cross-agent label check**: Check issues with ANY `source:*` label, not just `source:backlog-steward`. Issues created by `source:roadmap-agent`, `source:planning-agent`, `source:test-agent`, or any other agent count as existing coverage. - -**Signal 4 — Closed issue recency**: If an issue matching Signals 1-3 was closed in the last 30 days, treat it as covered. Do NOT re-file work that was recently completed or intentionally closed. - -### 4b. Batch Dedup Verification - -Before creating ANY issues, compile the full list of proposed new issues (titles only). Review them as a batch and remove any that: -- Are semantically equivalent to each other (two proposed issues covering the same work) -- Are semantically equivalent to any existing open or recently-closed issue identified in Step 4 -- Describe work that is clearly a subset of an existing open issue - -Only the de-duplicated list proceeds to Step 5. - -### 5. 
Create Issues for New Items - -For each unmatched item (up to **3 per run**), create a GitHub issue: - -```bash -gh issue create \ - --title "<type>: <description>" \ - --body "## Source - -- **Type**: <TODO|FIXME|HACK|ROADMAP> -- **Location**: \`<file>:<line>\` (or <%= paths.roadmap %> section) -- **Original text**: <annotation text> - -## Task Description - -<Expanded description of what needs to be done based on the annotation context> - -## Labels - -Created automatically by the Backlog Steward skill. - ---- -*Discovered by backlog-steward on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" \ - --label "source:backlog-steward" --label "<%= labels.pending %>" -``` - -Add task type label for ROADMAP items → `task:implementation` - -Ensure the `source:backlog-steward` label exists before using it: - -```bash -gh label create "source:backlog-steward" --color "C5DEF5" --description "Auto-created by Backlog Steward skill" 2>/dev/null || true -``` - -### 6. Summary - -Report: -- **ROADMAP unchecked items**: N -- **Already tracked**: N (matched to existing issues) -- **New issues created**: N (list issue numbers and titles) -- **Skipped (cap reached)**: N (if more than 3 unmatched items exist) - -## Rules - -- Create a maximum of **3 new issues per run** — if more unmatched items exist, report the overflow count but do not create them -- **Never close, modify, or comment on existing issues** — this skill is create-only -- **Never create duplicate issues** — always check against open issues first (title substring match) -- **Do NOT scan code annotations** (TODO/FIXME/HACK) — that is the Repo Hygiene Agent's job -- If `gh` CLI is not authenticated, report the error and STOP -- If no unchecked ROADMAP items are found, report "Backlog clean — no new items discovered" and STOP -- Only create issues relevant to the current active ROADMAP phase and the next phase diff --git a/packages/swarm/templates/skills/cicd-hardening-audit.md b/packages/swarm/templates/skills/cicd-hardening-audit.md 
deleted file mode 100644 index f621a199..00000000 --- a/packages/swarm/templates/skills/cicd-hardening-audit.md +++ /dev/null @@ -1,249 +0,0 @@ -# Skill: CI/CD Hardening Audit - -Audit CI/CD pipeline configuration for security hardening, best practices, and governance integration. Verify that all workflows enforce required checks, use pinned action versions, have proper permissions, and integrate with AgentGuard governance. Designed for weekly scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If data is unavailable or ambiguous, proceed with available data and note limitations -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Audit Workflow Files - -Read all GitHub Actions workflow files: - -```bash -ls .github/workflows/*.yml .github/workflows/*.yaml 2>/dev/null -``` - -For each workflow file, check: - -#### 2a. Action Version Pinning - -```bash -grep -n "uses:" .github/workflows/<file> -``` - -Flag: -- Actions using `@main` or `@master` instead of a pinned SHA or version tag — **HIGH risk** (supply chain attack vector) -- Actions using floating tags like `@v4` instead of exact versions like `@v4.1.2` — **MEDIUM risk** - -#### 2b. 
Workflow Permissions - -```bash -grep -n "permissions:" .github/workflows/<file> -``` - -Flag: -- Workflows without explicit `permissions:` block — **HIGH risk** (gets default write-all) -- Workflows with `permissions: write-all` — **HIGH risk** (overly permissive) -- Workflows that should use `contents: read` but have `contents: write` — **MEDIUM risk** - -#### 2c. Secret Usage - -```bash -grep -n "secrets\." .github/workflows/<file> -``` - -Flag: -- Secrets passed to third-party actions — **MEDIUM risk** -- Secrets used in `run:` steps without environment variable indirection — **LOW risk** - -#### 2d. Trigger Configuration - -```bash -grep -n "on:" .github/workflows/<file> -``` - -Flag: -- `pull_request_target` trigger without proper safeguards — **HIGH risk** (can run untrusted code with repo secrets) -- Missing branch protection on push triggers — **MEDIUM risk** - -### 3. Check Branch Protection - -```bash -gh api repos/{owner}/{repo}/branches/main/protection 2>/dev/null -``` - -Verify: -- **Required status checks** are configured -- **Required reviews** count is >= 1 (or 0 if fully autonomous — note this) -- **Dismiss stale reviews** is enabled -- **Require up-to-date branches** is enabled -- **Enforce for administrators** is enabled - -If branch protection API fails (may require admin access), note "Branch protection: unable to query (may require admin access)" and continue. - -### 4. Audit GitHub Actions Security - -Check for common CI/CD security issues: - -#### 4a. Script Injection - -```bash -grep -rn '\${{.*github\.event' .github/workflows/ 2>/dev/null -``` - -Flag any use of `${{ github.event.* }}` in `run:` steps — **HIGH risk** (script injection via PR titles, comments, etc.) - -#### 4b. Artifact Security - -```bash -grep -n "actions/upload-artifact\|actions/download-artifact" .github/workflows/*.yml 2>/dev/null -``` - -Check for: -- Artifacts with sensitive data — **MEDIUM risk** -- Missing artifact retention limits — **LOW risk** - -### 5. 
Check AgentGuard Governance Integration - -Verify that CI/CD integrates with AgentGuard governance: - -```bash -grep -rn "agentguard" .github/workflows/ 2>/dev/null -``` - -Check for: -- `agentguard ci-check` in CI workflows — governance verification -- `agentguard evidence-pr` in PR workflows — evidence attachment -- Governance hook configuration in `.claude/settings.json` - -Flag: -- CI workflows that modify code without governance — **MEDIUM risk** -- PR workflows that don't attach governance evidence — **LOW risk** (nice-to-have) -- Missing `agentguard-governance.yml` reusable workflow usage — **LOW risk** - -### 6. Check Dependency Supply Chain - -```bash -grep -rn "npm ci\|npm install\|yarn install\|pnpm install" .github/workflows/ 2>/dev/null -``` - -Flag: -- `npm install` instead of `npm ci` in CI — **MEDIUM risk** (non-deterministic) -- Missing `--ignore-scripts` flag for untrusted deps — **LOW risk** -- Missing npm provenance in publish workflow — **MEDIUM risk** - -### 7. Generate Hardening Report - -Compose a structured report in markdown: - -**Header**: -- Audit timestamp (UTC) -- Number of workflows analyzed -- Overall hardening score (0-100 based on findings) - -**Findings Summary**: -| Severity | Count | Examples | -|----------|-------|----------| -| HIGH | N | <brief list> | -| MEDIUM | N | <brief list> | -| LOW | N | <brief list> | - -**Detailed Findings**: -For each finding: -- **Severity**: HIGH / MEDIUM / LOW -- **Category**: action-pinning / permissions / secrets / triggers / branch-protection / script-injection / supply-chain / governance -- **File**: workflow file and line number -- **Description**: What was found -- **Recommendation**: Specific fix -- **Auto-fixable**: Yes / No - -**Governance Integration Status**: -| Check | Status | -|-------|--------| -| CI governance verification | Present / Missing | -| PR evidence attachment | Present / Missing | -| Hook configuration | Valid / Missing / Invalid | - -**Hardening Score Calculation**: 
-- Start at 100 -- Deduct 15 per HIGH finding -- Deduct 5 per MEDIUM finding -- Deduct 1 per LOW finding -- Minimum score: 0 - -### 8. Publish Report - -Check if a previous hardening report exists: - -```bash -gh issue list --state open --label "source:cicd-hardening" --json number --jq '.[0].number' -``` - -If a previous report exists, close it: - -```bash -gh issue close <PREV_NUMBER> --comment "Superseded by new CI/CD hardening audit." -``` - -Create the new report: - -```bash -gh label create "source:cicd-hardening" --color "0E8A16" --description "CI/CD Hardening Audit" 2>/dev/null || true -gh issue create \ - --title "CI/CD Hardening Audit — $(date +%Y-%m-%d)" \ - --body "<hardening report markdown>" \ - --label "source:cicd-hardening" --label "<%= labels.pending %>" -``` - -### 9. Create Fix Issues for HIGH Findings - -For each HIGH severity finding that is auto-fixable, create a targeted fix issue: - -```bash -gh issue create \ - --title "fix(ci): <brief description of the finding>" \ - --body "## CI/CD Hardening Fix - -- **Severity**: HIGH -- **Finding**: <description> -- **File**: <workflow file> -- **Fix**: <specific change needed> - -Discovered by CI/CD Hardening Audit on $(date +%Y-%m-%d). - ---- -*Auto-created by cicd-hardening-audit skill*" \ - --label "source:cicd-hardening" --label "<%= labels.high %>" --label "task:implementation" --label "<%= labels.pending %>" -``` - -Cap at **2 fix issues per run**. - -### 10. 
Summary - -Report: -- **Workflows audited**: N -- **Hardening score**: N/100 -- **HIGH findings**: N -- **MEDIUM findings**: N -- **LOW findings**: N -- **Fix issues created**: N -- **Governance integration**: Complete / Partial / Missing -- **Top concern**: Brief statement of the most critical finding - -## Rules - -- Create a maximum of **1 hardening report issue per run** -- Create a maximum of **2 fix issues per run** -- **Never modify workflow files directly** — only report findings and create fix issues for the Coder Agent -- **Never close unrelated issues** — only close previous hardening report issues labeled `source:cicd-hardening` -- If `gh` CLI is not authenticated, report the error and STOP -- When assessing severity, err on the side of higher severity (flag rather than ignore) -- Always check actual file content — do not assume based on file names alone diff --git a/packages/swarm/templates/skills/claim-issue.md b/packages/swarm/templates/skills/claim-issue.md deleted file mode 100644 index 28b2cf19..00000000 --- a/packages/swarm/templates/skills/claim-issue.md +++ /dev/null @@ -1,75 +0,0 @@ -# Skill: Claim Issue - -Claim a discovered GitHub issue for the current agent session. Updates labels, creates a working branch, and posts a start comment. - -## Prerequisites - -Run `discover-next-issue` first to identify the issue number. - -## Steps - -### 1. Update Issue Status - -Remove the pending label and mark as in-progress: - -```bash -gh issue edit <ISSUE_NUMBER> --remove-label "<%= labels.pending %>" --add-label "<%= labels.inProgress %>" -``` - -If the label update fails because the label does not exist on the repository, create it first: - -```bash -gh label create "<%= labels.inProgress %>" --color "0E8A16" --description "Agent is actively working on this" -``` - -Then retry the edit command. - -### 2. 
Determine Branch Name - -Map the task type label to a branch prefix: - -| Label | Branch Prefix | -|-------|--------------| -| `task:implementation` | `agent/implementation/issue-<N>` | -| `task:bug-fix` | `agent/bugfix/issue-<N>` | -| `task:refactor` | `agent/refactor/issue-<N>` | -| `task:test-generation` | `agent/tests/issue-<N>` | -| `task:documentation` | `agent/docs/issue-<N>` | -| (default) | `agent/task/issue-<N>` | - -### 3. Create Working Branch - -```bash -git checkout -b agent/<type>/issue-<ISSUE_NUMBER> -``` - -If the branch already exists (from a previous attempt): - -```bash -git checkout agent/<type>/issue-<ISSUE_NUMBER> -``` - -### 4. Verify Branch - -```bash -git branch --show-current -``` - -Confirm the output matches the expected branch name. - -### 5. Post Start Comment - -```bash -gh issue comment <ISSUE_NUMBER> --body "**AgentGuard Agent** — work started. - -- **Branch**: \`agent/<type>/issue-<ISSUE_NUMBER>\` -- **Governance**: Active (PreToolUse hooks enforcing policy) -- **Started**: $(date -u +%Y-%m-%dT%H:%M:%SZ)" -``` - -## Rules - -- If the branch already exists, check it out instead of creating a new one -- Always verify you are on the correct branch before proceeding -- If the issue is already `status:in-progress`, check if it was previously assigned — if so, resume work on the existing branch rather than starting fresh -- Do not claim more than one issue at a time diff --git a/packages/swarm/templates/skills/create-pr.md b/packages/swarm/templates/skills/create-pr.md deleted file mode 100644 index e6160daf..00000000 --- a/packages/swarm/templates/skills/create-pr.md +++ /dev/null @@ -1,201 +0,0 @@ -# Skill: Create Pull Request - -Create a pull request with a governance telemetry summary, risk assessment, and decision records. Pushes the branch, reads governance event data, runs pre-push simulation, generates a structured PR body, and updates the issue status. - -## Prerequisites - -All tests must pass — run `run-tests` first. 
- -## Steps - -### 1. Stage Governance Telemetry - -Stage governance event files so they are committed with the PR branch: - -```bash -git add .agentguard/events/*.jsonl 2>/dev/null || true -git add .agentguard/decisions/*.jsonl 2>/dev/null || true -git add <%= paths.logs %> 2>/dev/null || true -``` - -If no governance files exist yet, this is a no-op — proceed normally. - -### 2. Pre-Push Simulation - -Run impact simulation before pushing to assess blast radius and policy compliance: - -```bash -<%= paths.cli %> simulate --action git.push --branch $(git branch --show-current) --policy <%= paths.policy %> --json 2>/dev/null -``` - -Parse the JSON output for: -- **riskLevel**: low / medium / high -- **blastRadius**: weighted score -- **predictedChanges**: list of affected resources -- **policyResult**: allowed / denied - -If simulation shows a policy denial, report the denial reason and STOP — do not push a branch that would violate governance policy. - -If the simulate command is not available or fails, note "Simulation: not available" and proceed. - -### 3. Push Branch to Remote - -```bash -git push -u origin $(git branch --show-current) -``` - -If push fails due to remote rejection, diagnose and report. Do NOT force push. - -### 4. Collect Governance Telemetry - -Use the evidence-pr command in dry-run mode to collect and format governance telemetry: - -```bash -<%= paths.cli %> evidence-pr --last --dry-run --store sqlite 2>/dev/null -``` - -If the command fails or returns no output, fall back to JSONL mode: - -```bash -<%= paths.cli %> evidence-pr --last --dry-run 2>/dev/null -``` - -If no telemetry files exist, note "No governance telemetry recorded" — still proceed with PR creation. - -### 5. 
Collect Decision Records - -Read governance decision records for this session: - -```bash -ls -la .agentguard/decisions/ 2>/dev/null -cat .agentguard/decisions/*.jsonl 2>/dev/null | wc -l -cat .agentguard/decisions/*.jsonl 2>/dev/null | grep -c '"outcome":"deny"' || echo 0 -``` - -Parse decision records to extract: -- **Total decisions recorded** -- **Deny outcomes** and their reasons -- **Escalation levels** observed (NORMAL, ELEVATED, HIGH, LOCKDOWN) -- **Intervention types** (deny, rollback, pause, test-only) - -### 6. Compute Risk Score - -Run the analytics engine for a per-session risk assessment: - -```bash -<%= paths.cli %> analytics --format json 2>/dev/null | head -50 -``` - -Extract: -- **Risk score** (0-100) -- **Risk level** (low / medium / high / critical) -- **Top violation patterns** (if any) - -If the analytics command is not available, compute a basic risk level from telemetry: -- 0 denials + 0 violations → **low** -- 1-2 denials or violations → **medium** -- 3+ denials or any escalation → **high** -- Any LOCKDOWN event → **critical** - -### 7. 
Generate PR Body - -Use this structure: - -```markdown -## Summary -- <1-3 bullet points describing what was implemented> -- Closes #<ISSUE_NUMBER> - -## Changes -- <list of files modified with brief description of each change> - -## Test Plan -- [ ] TypeScript build passes (`pnpm build`) -- [ ] Vitest tests pass (`pnpm test`) -- [ ] ESLint clean (`pnpm lint`) -- [ ] Prettier clean (`pnpm format`) - -## Risk Assessment - -| Metric | Value | -|--------|-------| -| Risk level | <low/medium/high/critical> | -| Risk score | <N>/100 | -| Blast radius | <N> (weighted) | -| Simulation result | <allowed/denied/not available> | - -## Governance Report - -| Metric | Count | -|--------|-------| -| Total events | <N> | -| Actions allowed | <N> | -| Actions denied | <N> | -| Policy denials | <N> | -| Invariant violations | <N> | -| Escalation events | <N> | -| Decision records | <N> | - -<details> -<summary>Governance details</summary> - -**Source**: `.agentguard/events/`, `.agentguard/decisions/`, `<%= paths.logs %>` - -**Decision Records**: <N> total, <N> denials -**Escalation levels observed**: <list or "NORMAL only"> -**Pre-push simulation**: <risk level, blast radius, or "not available"> - -[List any notable denials or violations with their reasons] - -</details> -``` - -### 8. Create the PR - -```bash -gh pr create --title "<type>(issue-<N>): <concise title>" --body "<generated body>" -``` - -Use the issue title as the basis for the PR title. Keep it under 70 characters. Use conventional prefixes: `feat`, `fix`, `refactor`, `test`, `docs`. - -If a PR already exists for this branch: - -```bash -gh pr view --json url --jq '.url' -``` - -Update the existing PR instead: - -```bash -gh pr edit <PR_NUMBER> --body "<updated body>" -``` - -### 9. Update Issue Label - -```bash -gh issue edit <ISSUE_NUMBER> --remove-label "<%= labels.inProgress %>" --add-label "<%= labels.review %>" -``` - -### 10. 
Comment on Issue - -```bash -gh issue comment <ISSUE_NUMBER> --body "**AgentGuard Agent** — pull request created. - -- **PR**: <PR_URL> -- **Branch**: \`$(git branch --show-current)\` -- **Risk level**: <low/medium/high/critical> -- **Actions evaluated**: <N> -- **Denials**: <N> -- **Decision records**: <N> -- **Completed**: $(date -u +%Y-%m-%dT%H:%M:%SZ)" -``` - -## Rules - -- Do NOT force push — if push fails, diagnose and report -- If `gh pr create` fails because a PR already exists, update the existing PR -- If no governance telemetry exists, still create the PR but note "Governance telemetry: not available" in the report -- If pre-push simulation shows a policy denial, STOP and report — do not create a PR for policy-violating changes -- Mark all test plan checkboxes that passed during `run-tests` -- The PR title must be under 70 characters -- If analytics or simulation commands are not available, degrade gracefully and note the limitation diff --git a/packages/swarm/templates/skills/dependency-security-audit.md b/packages/swarm/templates/skills/dependency-security-audit.md deleted file mode 100644 index 29c587ea..00000000 --- a/packages/swarm/templates/skills/dependency-security-audit.md +++ /dev/null @@ -1,254 +0,0 @@ -# Skill: Dependency Security Audit - -Run security audits on project dependencies, check for known vulnerabilities, identify outdated packages, and review Dependabot alerts. Creates a high-priority issue if critical or high-severity vulnerabilities are found. Designed for periodic scheduled execution. - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. Also requires `npm` and `gh` CLI. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. 
Run npm Audit - -```bash -npm audit --json 2>/dev/null || npm audit 2>/dev/null -``` - -Parse the output to extract: -- **Total vulnerabilities** by severity (critical, high, moderate, low) -- **Affected packages** and their vulnerable version ranges -- **Fix available**: whether a non-breaking fix exists (`npm audit fix --dry-run`) - -If `npm audit` reports 0 vulnerabilities, note "npm audit: clean". - -### 3. Check for Outdated Packages - -```bash -npm outdated --json 2>/dev/null || npm outdated 2>/dev/null -``` - -For each outdated package, extract: -- **Package name** -- **Current version** vs **latest version** -- **Update type**: patch, minor, or major (breaking) - -Categorize by risk: -- Major version behind on a runtime dependency → **HIGH** risk -- Major version behind on a dev dependency → **MODERATE** risk -- Minor/patch behind → **LOW** risk - -### 4. Check Dependabot Alerts - -```bash -gh api repos/{owner}/{repo}/dependabot/alerts --jq '[.[] | select(.state=="open")] | length' 2>/dev/null || echo "Dependabot API not available" -``` - -If alerts are available, list open ones: - -```bash -gh api repos/{owner}/{repo}/dependabot/alerts --jq '.[] | select(.state=="open") | {number: .number, package: .security_vulnerability.package.name, severity: .security_vulnerability.severity, summary: .security_advisory.summary}' 2>/dev/null -``` - -### 5. Check for Known Supply Chain Risks - -Review `package-lock.json` for concerning patterns: - -```bash -# Check total dependency count -node -e "const lock = require('./package-lock.json'); const deps = Object.keys(lock.packages || {}).filter(k => k !== ''); console.log('Total packages:', deps.length)" - -# Check for install scripts (potential supply chain risk) -node -e "const lock = require('./package-lock.json'); const pkgs = lock.packages || {}; Object.entries(pkgs).forEach(([name, info]) => { if (info.hasInstallScript) console.log('Install script:', name) })" 2>/dev/null -``` - -### 6. 
Check Fix Availability - -For any critical or high vulnerabilities, check if automated fixes are available: - -```bash -npm audit fix --dry-run 2>/dev/null -``` - -Report which vulnerabilities can be auto-fixed and which require manual intervention (breaking changes). - -### 7. License Compliance Check - -Check licenses of all dependencies for compatibility: - -```bash -npx license-checker --json --production 2>/dev/null | head -200 -``` - -If `license-checker` is not available, fall back to manual inspection: - -```bash -node -e " -const lock = require('./package-lock.json'); -const pkg = require('./package.json'); -const deps = Object.keys(pkg.dependencies || {}); -deps.forEach(d => { - try { - const p = require('./node_modules/' + d + '/package.json'); - console.log(d + ': ' + (p.license || 'UNKNOWN')); - } catch(e) { console.log(d + ': UNREADABLE'); } -}); -" -``` - -Flag problematic licenses: -- **GPL-2.0, GPL-3.0, AGPL** → **HIGH** risk (copyleft, may require source disclosure) -- **UNKNOWN, UNLICENSED** → **MEDIUM** risk (legal uncertainty) -- **MIT, Apache-2.0, BSD, ISC** → OK (permissive) - -### 8. Secret Detection in Configuration - -Scan project configuration files for accidentally committed secrets: - -```bash -grep -rn "password\|secret\|api.key\|token\|credential" .env* *.yaml *.json --include="*.env" --include="*.yaml" --include="*.json" 2>/dev/null | grep -v "node_modules\|package-lock\|dist" | head -20 -``` - -Also check git history for recently added secrets: - -```bash -git log --oneline -20 --diff-filter=A -- "*.env" "*.key" "*.pem" 2>/dev/null -``` - -Flag: -- Any `.env` files tracked in git → **CRITICAL** -- Any credential-like strings in YAML/JSON config → **HIGH** -- Any key/pem files committed → **CRITICAL** - -### 9. 
Compare with Previous Audit - -Check if a previous audit issue exists and compare: - -```bash -gh issue list --state all --label "source:security-audit" --json number,title,createdAt --limit 5 -``` - -If a previous audit exists, note: -- **New vulnerabilities** since last audit -- **Resolved vulnerabilities** since last audit -- **Trend**: improving, stable, or degrading - -### 10. Generate Report - -Compile findings into a structured report: - -``` -## Dependency Security Audit Report - -**Date**: <timestamp> -**Node version**: $(node -v) -**npm version**: $(npm -v) - -### Vulnerability Summary - -| Severity | Count | -|----------|-------| -| Critical | N | -| High | N | -| Moderate | N | -| Low | N | - -### Outdated Packages - -| Package | Current | Latest | Type | Risk | -|---------|---------|--------|------|------| -| <name> | <current> | <latest> | <major/minor/patch> | <HIGH/MODERATE/LOW> | - -### Dependabot Alerts - -| # | Package | Severity | Summary | -|---|---------|----------|---------| -| <N> | <name> | <severity> | <summary> | - -### Supply Chain Notes - -- Total dependency count: N -- Packages with install scripts: N - -### License Compliance - -| License | Count | Risk | -|---------|-------|------| -| MIT | N | OK | -| Apache-2.0 | N | OK | -| GPL-* | N | HIGH | -| UNKNOWN | N | MEDIUM | - -### Secret Detection - -| Finding | File | Severity | -|---------|------|----------| -| <description> | <file> | <CRITICAL/HIGH> | - -### Trend (vs. Previous Audit) - -- New vulnerabilities: N -- Resolved vulnerabilities: N -- Trend: Improving / Stable / Degrading - -### Recommendations - -<Actionable remediation steps ranked by severity> -``` - -### 11. 
Create or Update Issue (if critical/high found) - -If any critical or high-severity vulnerabilities, license issues, or secret detections exist, check for an existing audit issue: - -```bash -gh issue list --state open --label "source:security-audit" --json number,title --limit 1 -``` - -Ensure the label exists: - -```bash -gh label create "source:security-audit" --color "D93F0B" --description "Auto-created by Dependency Security Audit skill" 2>/dev/null || true -``` - -If an existing issue is open, update it with the latest findings: - -```bash -gh issue comment <ISSUE_NUMBER> --body "<updated audit report>" -``` - -If no existing issue is open, create one: - -```bash -gh issue create \ - --title "security-audit: <N> critical/<N> high vulnerabilities found" \ - --body "<full audit report>" \ - --label "source:security-audit" --label "<%= labels.critical %>" -``` - -Use `priority:critical` if any critical vulnerabilities or secrets detected, otherwise `priority:high`. - -### 12. Summary - -Report: -- **Vulnerabilities**: N critical, N high, N moderate, N low -- **Outdated packages**: N (N high-risk) -- **Dependabot alerts**: N open -- **Auto-fixable**: N vulnerabilities -- **License issues**: N (N high-risk copyleft, N unknown) -- **Secrets detected**: N -- **Trend**: improving/stable/degrading vs. 
previous audit -- **Issue**: created/updated/none needed -- If clean: "No security issues found — dependencies healthy" - -## Rules - -- **Never run `npm audit fix`** without `--dry-run` — this skill is analysis-only, not remediation -- **Never modify `package.json` or `package-lock.json`** — only read and report -- **Never close existing security issues** — only create new ones or comment on existing open ones -- If `npm audit` fails (no lockfile, network error), report the error and continue with other checks -- If Dependabot API is not available (permissions, not enabled), skip that step and continue -- If no vulnerabilities are found, report "Dependencies healthy" and STOP — do not create an issue -- If `gh` CLI is not authenticated, still generate the report to console but skip issue creation diff --git a/packages/swarm/templates/skills/discover-next-issue.md b/packages/swarm/templates/skills/discover-next-issue.md deleted file mode 100644 index 391fdc48..00000000 --- a/packages/swarm/templates/skills/discover-next-issue.md +++ /dev/null @@ -1,139 +0,0 @@ -# Skill: Discover Next Issue - -Find the next GitHub issue to work on from the project's issue queue. Issues are selected by the `status:pending` label, sorted by priority, and assessed for governance risk level. Escalation context is checked to avoid high-risk work during elevated governance states. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. 
- -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If no issues match criteria, report cleanly and **STOP** -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation - -## Prerequisites - -Run `start-governance-runtime` first. - -## Steps - -### 0. Read Swarm State (if available) - -Check for shared swarm state to inform issue selection: - -```bash -cat <%= paths.swarmState %> 2>/dev/null -``` - -If the file exists and is valid JSON, extract: -- **mode**: if `safe`, output "System in SAFE MODE — skipping issue discovery" and STOP immediately. If `conservative`, reduce to smallest-scope issues only (< 5 files in scope) -- **prQueueHealthy**: if `false`, output "PR queue unhealthy — skipping issue discovery" and STOP -- **currentPhase**: prefer issues aligned with this ROADMAP phase -- **priorities**: if present, prefer issues listed in the priorities array - -If the file does not exist or is invalid, proceed with standard discovery. - -### 1. Query Pending Issues - -```bash -gh issue list --label "<%= labels.pending %>" --state open --json number,title,labels,body --limit 20 -``` - -If no issues are returned, report "No work available" and STOP. - -### 2. Filter by Role - -From the returned issues, select only those with at least one of these labels: -- `role:developer` -- `task:implementation` -- `task:bug-fix` -- `task:refactor` -- `task:test-generation` -- `task:documentation` - -Exclude issues labeled `role:architect` or `role:auditor` (those require different capability bundles). - -### 3. Sort by Priority - -Order the filtered issues by priority label: - -1. `priority:critical` (highest) -2. `priority:high` -3. `priority:medium` -4. 
`priority:low` (lowest) - -Issues without a priority label are treated as `priority:low`. - -Select the highest-priority issue. - -### 4. Check Dependencies - -If the selected issue body contains a `## Dependencies` section with issue references (e.g., `#41, #39`), verify each dependency is closed: - -```bash -gh issue view <DEP_NUMBER> --json state --jq '.state' -``` - -If any dependency is still `OPEN`: -- Report: "Issue #N has unresolved dependencies: #X (open)" -- Skip this issue and select the next highest-priority issue -- Repeat until an issue with no blocking dependencies is found - -### 5. Check Escalation Context - -Before finalizing issue selection, check the current governance escalation level: - -```bash -cat <%= paths.logs %> 2>/dev/null | grep -i "escalat\|StateChanged" | tail -5 -``` - -Determine the current escalation state: -- **NORMAL** (level 0): all issues eligible -- **ELEVATED** (level 1): prefer issues with smaller File Scope (fewer files) -- **HIGH** (level 2): only select issues with explicit File Scope of 5 files or fewer -- **LOCKDOWN** (level 3): report "Governance LOCKDOWN active — deferring new work" and STOP - -If telemetry data is unavailable, assume NORMAL and proceed. - -### 6. Estimate Blast Radius - -For the selected issue, estimate the governance risk: - -If the issue body contains a `## File Scope` section, count the listed files. Then simulate: - -```bash -<%= paths.cli %> simulate --action file.write --target <first-file-in-scope> --policy <%= paths.policy %> --json 2>/dev/null -``` - -Classify the estimated blast radius: -- **1-5 files**: low risk -- **6-15 files**: medium risk -- **16+ files**: high risk - -If escalation is ELEVATED and the estimated blast radius is high, prefer the next lower-risk issue. - -If the simulate command is not available, skip this step. - -### 7. 
Display Issue Details - -For the selected issue, output: - -- **Issue number** and **title** -- **Labels** (all) -- **Estimated risk level**: low / medium / high (from blast radius estimate) -- **Current escalation**: NORMAL / ELEVATED / HIGH -- **Task Description** section from the body -- **Acceptance Criteria** section from the body -- **File Scope** section from the body (if present) -- **Dependencies** section (if present, with status of each) - -## Rules - -- If no pending issues exist, report "No work available" and STOP -- If all pending issues have unresolved dependencies, report "All pending issues blocked by dependencies" and STOP -- If governance is in LOCKDOWN, report and STOP — do not select any issue -- If escalation is HIGH, only select issues with small file scope (5 files or fewer) -- Do not select issues that are already `status:in-progress` or `status:assigned` -- Output the selected issue number clearly — it is needed by `claim-issue` diff --git a/packages/swarm/templates/skills/full-test.md b/packages/swarm/templates/skills/full-test.md deleted file mode 100644 index 8c9650d9..00000000 --- a/packages/swarm/templates/skills/full-test.md +++ /dev/null @@ -1,83 +0,0 @@ -# Skill: Full Test - -Run the complete build, type-check, test, lint, format, and coverage verification suite. This is the comprehensive "is everything OK?" check for the AgentGuard codebase. - -## Steps - -Run these in sequence. If a step fails, record the failure and continue so the report covers every check. - -### 1. Build TypeScript - -```bash -pnpm build -``` - -Compiles all workspace packages via Turborepo. Report build success or failure with error details. - -### 2. Type-Check - -```bash -pnpm ts:check -``` - -Runs `tsc --noEmit` for strict type verification. Report any type errors with file:line references. - -### 3. Run Tests (vitest) - -```bash -pnpm test -``` - -Report pass/fail count. If tests fail, note the failing test names and error messages. - -### 4. 
Run ESLint - -```bash -pnpm lint -``` - -Report any lint errors with file:line references. - -### 5. Run Prettier Format Check - -```bash -pnpm format -``` - -Report any formatting issues. - -### 6. Run Coverage Check - -```bash -pnpm test:coverage -``` - -Report line coverage percentage. The project threshold is 50% line coverage. - -### 7. Summary - -Provide a structured pass/fail summary: - -``` -## Full Test Report - -| Check | Status | Details | -|-------|--------|---------| -| Build | PASS/FAIL | <error count or clean> | -| Type-check | PASS/FAIL | <error count or clean> | -| Tests (vitest) | PASS/FAIL | <X pass / Y fail> | -| Lint | PASS/FAIL | <error count or clean> | -| Format | PASS/FAIL | <issue count or clean> | -| Coverage | PASS/FAIL | <X% lines (threshold: 50%)> | -``` - -One-line verdict: -- **All clear**: "All 6 checks passed — codebase healthy" -- **Issues found**: "N/6 checks failed — see details above" - -## Rules - -- **Read-only** — do not fix, modify, or commit anything. This skill only reports. -- Run all steps even if earlier steps fail — report the full picture. -- If a command times out (>2 minutes), note the timeout and continue. -- If `node_modules` is missing, run `pnpm install` first, then proceed. diff --git a/packages/swarm/templates/skills/generate-tests.md b/packages/swarm/templates/skills/generate-tests.md deleted file mode 100644 index 75b807f4..00000000 --- a/packages/swarm/templates/skills/generate-tests.md +++ /dev/null @@ -1,146 +0,0 @@ -# Skill: Generate Tests - -Analyze untested source modules, generate test files following existing vitest patterns, and create a PR. Transforms the Test Agent from observer to actor. Designed for periodic scheduled execution after `test-health-review` identifies coverage gaps. - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. 
Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active. If governance cannot be activated, STOP. - -### 2. Identify Untested Modules - -List source files and their corresponding test files: - -```bash -find packages apps -name "*.ts" -not -name "*.d.ts" -not -path "*/node_modules/*" -not -path "*/dist/*" | sort -find packages apps -name "*.test.ts" -not -path "*/node_modules/*" | sort -``` - -For each source file, check if a corresponding test file exists in the same package's `tests/` directory. For example, `packages/kernel/src/kernel.ts` should have a test at `packages/kernel/tests/kernel.test.ts`. - -Prioritize untested files by: -1. Core kernel files (`packages/kernel/src/`) — highest priority -2. Policy and invariant files (`packages/policy/src/`, `packages/invariants/src/`) -3. Adapter files (`packages/adapters/src/`) -4. CLI command files (`apps/cli/src/commands/`) -5. Utility files (`packages/core/src/`) - -### 3. Study Existing Test Patterns - -Read 2-3 existing test files to understand conventions: - -```bash -cat packages/kernel/tests/agentguard-aab.test.ts | head -50 -cat packages/kernel/tests/agentguard-engine.test.ts | head -50 -``` - -Extract patterns: -- Import style (`import { describe, it, expect } from 'vitest'`) -- Test structure (`describe` → `it` blocks) -- Assertion patterns (`expect(...).toBe(...)`, `expect(...).toThrow(...)`) -- Mock/stub patterns (how dependencies are isolated) -- Naming conventions for test descriptions - -### 4. Select Target Module - -Pick the highest-priority untested module (max 1 per run). Read the source file to understand: - -- Exported functions and classes -- Input types and return types -- Edge cases (null handling, error paths, boundary values) -- Dependencies that need mocking - -```bash -cat packages/<pkg>/src/<target>.ts -``` - -### 5. 
Generate Test File - -Create a test file following the established patterns: - -- File path: `packages/<pkg>/tests/<module-name>.test.ts` -- Import the module under test -- Use `describe` blocks matching exported functions/classes -- Include tests for: - - Happy path (normal inputs → expected outputs) - - Edge cases (empty inputs, null/undefined, boundary values) - - Error cases (invalid inputs → expected errors) - - Type contracts (return types match declarations) -- Use mocks/stubs for external dependencies (file system, network, git) - -### 6. Verify Tests Pass - -```bash -pnpm build -npx vitest run packages/<pkg>/tests/<module-name>.test.ts -``` - -If tests fail: -- Fix test logic (not source code — this skill writes tests, not fixes) -- Re-run until tests pass -- If tests cannot pass after 2 attempts, STOP and report the issue - -### 7. Commit and Create PR - -```bash -git checkout -b test/add-<module-name>-tests -git add packages/<pkg>/tests/<module-name>.test.ts -git commit -m "test: add tests for <module-name>" -git push -u origin test/add-<module-name>-tests -``` - -Create PR: - -```bash -gh pr create \ - --title "test: add tests for <module-name>" \ - --body "## Summary - -Adds test coverage for \`packages/<pkg>/src/<target>.ts\`. - -## Test Coverage - -- <N> test cases covering: - - Happy path scenarios - - Edge cases - - Error handling - -## Generated By - -Auto-generated by the \`generate-tests\` skill (Test Agent). - ---- -*source:test-agent*" \ - --label "source:test-agent" -``` - -Ensure label exists: - -```bash -gh label create "source:test-agent" --color "0E8A16" --description "Auto-created by Test Agent" 2>/dev/null || true -``` - -### 8. 
Summary - -Report: -- **Module tested**: `packages/<pkg>/src/<path>` -- **Test file created**: `packages/<pkg>/tests/<module-name>.test.ts` -- **Test cases**: N (N passing) -- **PR created**: #<N> -- If no untested modules found: "All source modules have test coverage" - -## Rules - -- **Generate tests for 1 module per run** — do not batch multiple modules. -- **Never modify source code** — only create test files. If tests fail because of a bug in the source, report the bug but do not fix it. -- **Follow existing test patterns exactly** — match the import style, describe/it structure, and assertion patterns from existing tests. -- **Never delete existing test files** — only add new ones. -- **Never overwrite existing test files** — if a test file already exists for the target module, skip it. -- If `pnpm build` fails, STOP — the codebase must compile before tests can be generated. -- If `gh` CLI is not authenticated, still generate the test file locally but skip PR creation. -- Target modules must be in `packages/*/src/` or `apps/*/src/` — do not generate tests for files outside the source tree. diff --git a/packages/swarm/templates/skills/governance-log-audit.md b/packages/swarm/templates/skills/governance-log-audit.md deleted file mode 100644 index 6437d5fe..00000000 --- a/packages/swarm/templates/skills/governance-log-audit.md +++ /dev/null @@ -1,302 +0,0 @@ -# Skill: Governance Log Audit - -Analyze governance event logs for cross-session trends, escalation trajectory, risk score progression, and per-agent governance compliance. Uses the AgentGuard analytics engine for aggregation and decision records for rich outcome analysis. Focuses on historical pattern analysis and compliance reporting — leave real-time anomaly detection to the Observability Agent, and policy quality analysis to `policy-effectiveness-review`. Creates an issue if actionable findings exist. Designed for periodic scheduled execution. 
- -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If data is unavailable or ambiguous, proceed with available data and note limitations -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance — even log analysis should be auditable. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Locate Log Files - -```bash -ls -la .agentguard/events/*.jsonl 2>/dev/null -ls -la .agentguard/decisions/*.jsonl 2>/dev/null -ls -la <%= paths.logs %> 2>/dev/null -``` - -If no log files exist, report "No governance logs found — nothing to audit" and STOP. - -### 3. Run Cross-Session Analytics - -Use the AgentGuard analytics engine for aggregated cross-session data: - -```bash -<%= paths.cli %> analytics --format json 2>/dev/null | head -200 -``` - -Extract: -- **Total sessions** analyzed -- **Per-session risk scores** and risk levels -- **Violation clustering**: which action types, targets, and branches produce the most violations -- **Denial rate trend**: increasing, stable, or decreasing across sessions -- **Top violation patterns**: recurring invariant or policy violations - -If the analytics command is not available, fall back to manual counting in Step 4. - -### 4. 
Count Events by Type - -Count each governance event type across all log files. `grep -c` already prints `0` when nothing matches (and exits non-zero), so `|| true` is used only to keep the exit status clean: - -```bash -cat .agentguard/events/*.jsonl 2>/dev/null | grep -c "ActionRequested" || true -cat .agentguard/events/*.jsonl 2>/dev/null | grep -c "ActionAllowed" || true -cat .agentguard/events/*.jsonl 2>/dev/null | grep -c "ActionDenied" || true -cat .agentguard/events/*.jsonl 2>/dev/null | grep -c "PolicyDenied" || true -cat .agentguard/events/*.jsonl 2>/dev/null | grep -c "InvariantViolation" || true -cat .agentguard/events/*.jsonl 2>/dev/null | grep -c "ActionEscalated" || true -cat .agentguard/events/*.jsonl 2>/dev/null | grep -c "BlastRadiusExceeded" || true -cat .agentguard/events/*.jsonl 2>/dev/null | grep -c "MergeGuardFailure" || true -``` - -Also count total events: - -```bash -cat .agentguard/events/*.jsonl 2>/dev/null | wc -l -``` - -### 5. Analyze Decision Records - -Read decision records for richer outcome analysis: - -```bash -cat .agentguard/decisions/*.jsonl 2>/dev/null | head -200 -``` - -Parse each `GovernanceDecisionRecord` and aggregate: -- **Outcome distribution**: allow vs. deny counts -- **Intervention types**: deny, rollback, pause, test-only (count each) -- **Escalation levels**: Distribution of NORMAL through LOCKDOWN -- **Top denial reasons**: Group by reason, count occurrences -- **Execution success rate**: succeeded vs. failed -- **Per-session risk scores**: Extract and track over time - -### 6. 
Compute Metrics - -Calculate key governance health metrics: - -- **Denial rate**: `(ActionDenied + PolicyDenied) / ActionRequested * 100` -- **Invariant violation rate**: `InvariantViolation / ActionRequested * 100` -- **Escalation count**: total ActionEscalated events -- **Average risk score**: mean of per-session risk scores - -Flag these thresholds: -- Denial rate > 20% → **WARNING** -- Denial rate > 50% → **CRITICAL** -- Any InvariantViolation → **WARNING** -- Any ActionEscalated → **WARNING** -- Any BlastRadiusExceeded → **WARNING** -- Average risk score > 50 → **WARNING** -- Any session risk score > 70 → **CRITICAL** - -### 7. Analyze Per-Agent Compliance - -Group events by agent identity (extract from event metadata): - -```bash -cat .agentguard/events/*.jsonl 2>/dev/null | grep "ActionDenied\|PolicyDenied" | head -100 -``` - -For each agent: -- **Total actions requested** -- **Denial count and rate** -- **Types of denials** (policy vs. invariant) -- **Compliance score**: `(allowed / total) * 100` - -Identify: -- **Compliant agents**: denial rate <5% -- **Boundary-testing agents**: denial rate 5-20% -- **Non-compliant agents**: denial rate >20% (persistent bad behavior) - -### 8. Analyze Cross-Session Trends - -If multiple log files exist (each representing a session), compare across sessions: - -```bash -ls -lt .agentguard/events/*.jsonl 2>/dev/null | head -10 -``` - -For the last 5 sessions, compute: -- Denial rate per session (is it trending up or down?) -- Risk score per session (from analytics or decision records) -- Escalation levels reached per session -- Most common denial reason per session -- Session duration and event volume - -Look for: -- **Improving trend**: denial rate decreasing, risk scores declining across sessions (agents learning) -- **Degrading trend**: denial rate increasing, risk scores rising (new bad patterns emerging) -- **Escalation trajectory**: are sessions reaching higher escalation levels over time? - -### 9. 
Check Escalation History - -Read all escalation-related events across sessions: - -```bash -cat .agentguard/events/*.jsonl 2>/dev/null | grep -i "escalat\|lockdown" | tail -20 -``` - -Build an escalation timeline: -- When did each escalation occur? -- What action triggered it? -- Did the system recover (de-escalate) or remain elevated? -- Any LOCKDOWN events → **CRITICAL** - -### 10. Generate Report - -Compile the audit findings into a structured report: - -``` -## Governance Log Audit Report - -**Date**: <timestamp> -**Log files analyzed**: <count> -**Decision records analyzed**: <count> -**Total events**: <N> -**Sessions covered**: <N> - -### Event Summary - -| Event Type | Count | -|------------|-------| -| ActionRequested | N | -| ActionAllowed | N | -| ActionDenied | N | -| PolicyDenied | N | -| InvariantViolation | N | -| ActionEscalated | N | -| BlastRadiusExceeded | N | - -### Health Metrics - -| Metric | Value | Status | -|--------|-------|--------| -| Denial rate | X% | OK/WARNING/CRITICAL | -| Invariant violation rate | X% | OK/WARNING | -| Escalation events | N | OK/WARNING | -| Average risk score | N/100 | OK/WARNING/CRITICAL | - -### Risk Score Trend - -| Session | Date | Risk Score | Risk Level | Trend | -|---------|------|------------|------------|-------| -| <id> | <date> | N/100 | low/medium/high/critical | ↑/↓/→ | - -### Decision Record Analysis - -| Metric | Value | -|--------|-------| -| Total decisions | N | -| Deny outcomes | N | -| Intervention types | deny: N, rollback: N, pause: N | -| Execution success rate | N% | - -### Per-Agent Compliance - -| Agent | Actions | Denials | Compliance | Status | -|-------|---------|---------|------------|--------| -| <agent> | N | N | X% | COMPLIANT/BOUNDARY/NON-COMPLIANT | - -### Cross-Session Trends - -| Session | Date | Events | Denial Rate | Risk Score | Max Escalation | -|---------|------|--------|-------------|------------|----------------| -| <id> | <date> | N | X% | N/100 | 
NORMAL/ELEVATED/HIGH/LOCKDOWN | - -**Trend**: Improving / Stable / Degrading - -### Escalation Timeline - -<Chronological list of escalation events with triggers and recovery> - -### Recommendations - -<Actionable recommendations focused on agent compliance, risk reduction, and trend direction> -``` - -### 11. Create or Update Issue (if actionable) - -If any WARNING or CRITICAL findings exist, check for an existing audit issue: - -```bash -gh issue list --state open --label "source:governance-audit" --json number,title --limit 1 -``` - -Apply the `report-routing` protocol: - -**If CRITICAL findings** (LOCKDOWN events, escalation trending toward critical, any session risk score >70, or a sustained denial rate >50%, matching the CRITICAL thresholds in Steps 6 and 9) → **ALERT tier**: - -Ensure the label exists: - -```bash -gh label create "source:governance-audit" --color "D93F0B" --description "Auto-created by Governance Log Audit skill" 2>/dev/null || true -``` - -```bash -gh issue create \ - --title "ALERT: governance-audit — <summary of critical finding>" \ - --body "<critical findings with evidence>" \ - --label "source:governance-audit" --label "<%= labels.critical %>" -``` - -**If actionable findings but not critical → REPORT tier** (write to local file): - -```bash -mkdir -p .agentguard/reports -cat > .agentguard/reports/governance-audit-$(date +%Y-%m-%d).md <<'REPORT_EOF' -<full audit report> -REPORT_EOF -``` - -Close any previous governance audit issues that are not critical: - -```bash -PREV=$(gh issue list --state open --label "source:governance-audit" --json number,labels --jq '[.[] | select(.labels | map(.name) | index("<%= labels.critical %>") | not)] | .[].number' 2>/dev/null) -for num in $PREV; do - gh issue close "$num" --comment "Superseded — audit reports now written to .agentguard/reports/" 2>/dev/null || true -done -``` - -**If all metrics nominal → LOG tier**: - -```bash -mkdir -p .agentguard/logs -echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) [governance-audit] Governance logs nominal. Denial rate: N%. Violations: N." 
>> .agentguard/logs/swarm.log -``` - -### 12. Summary - -Report the audit findings to the console, including: -- Total events analyzed -- Decision records analyzed -- Key metrics (denial rate, violation rate, average risk score) -- Number of warnings and critical findings -- Risk score trend direction -- Output routed to: ALERT / REPORT / LOG -- "Governance logs nominal" if no actionable findings - -## Rules - -- **Routine audit reports go to `.agentguard/reports/`, NOT GitHub issues** — follow the report-routing protocol -- Create a maximum of **1 alert issue per run** — only for CRITICAL findings -- **Read-only on log files** — never modify, truncate, or delete governance logs -- If no log files exist, report cleanly and STOP — do not error -- If all metrics are within thresholds, log and STOP — do not create an issue or report file -- Cap pattern analysis at 20 events per type to avoid excessive processing -- If `gh` CLI is not authenticated, still generate the report to console but skip issue creation diff --git a/packages/swarm/templates/skills/implement-issue.md b/packages/swarm/templates/skills/implement-issue.md deleted file mode 100644 index ea43df14..00000000 --- a/packages/swarm/templates/skills/implement-issue.md +++ /dev/null @@ -1,155 +0,0 @@ -# Skill: Implement Issue - -Execute the implementation work described in the claimed GitHub issue. Reads the issue for requirements, respects file scope, validates changes against governance policy via simulation, follows coding conventions, and commits changes. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. 
- -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If requirements are ambiguous, implement the most conservative interpretation and note assumptions in the commit message -- If governance activation fails, log the failure and **STOP** — do not ask what to do -- If a policy simulation denies a file change, **skip that file** and note the denial in the commit message -- Default to the **safest option** in every ambiguous situation - -## Prerequisites - -Run `claim-issue` first. Must be on the working branch. - -## Steps - -### 1. Read Issue Details - -```bash -gh issue view <ISSUE_NUMBER> --json body,title --jq '.body' -``` - -Extract from the body: -- **Task Description** — what to implement -- **Acceptance Criteria** — success conditions (checklist items) -- **File Scope** — allowed paths (if specified) -- **Protected Paths** — paths that must NOT be modified - -### 2. Verify Branch - -```bash -git branch --show-current -``` - -Must match `agent/<type>/issue-<N>`. If not on the correct branch, STOP. - -### 3. Implement Changes - -Follow these coding conventions (from CLAUDE.md): - -- **camelCase** for functions and variables -- **UPPER_SNAKE_CASE** for constants -- **const/let** only, no `var` -- Arrow functions preferred -- `import type` for type-only imports (`verbatimModuleSyntax: true`) -- Single quotes, trailing commas (es5), printWidth 100, tabWidth 2, semicolons -- Node.js >= 18 - -If a **File Scope** section exists in the issue, only modify files matching the listed paths. If you need to modify a file outside the scope, note it as a scope extension request in the PR body later. - -If a **Protected Paths** section exists, do NOT modify those files. The kernel will deny the action via policy, but avoid triggering denials proactively. - -### 4. 
Pre-Commit Policy Simulation - -Before committing, validate each modified file against governance policy: - -```bash -git diff --name-only HEAD -``` - -For each modified file, run simulation: - -```bash -<%= paths.cli %> simulate --action file.write --target <file> --policy <%= paths.policy %> --json 2>/dev/null -``` - -Check the simulation result: -- If **allowed**: proceed with the file -- If **denied**: do NOT commit that file — note the policy violation and the denial reason - -If simulation shows a denial, attempt to resolve: -1. Check if the file is in a protected path (kernel, policy, invariants) — if so, verify the issue explicitly authorizes it -2. Check if the file matches a deny rule in `<%= paths.policy %>` — if so, note it as a governance constraint - -If the simulate command is not available, skip this step and proceed. - -### 5. Type-Check - -```bash -pnpm ts:check -``` - -If type errors exist in files you modified, fix them before proceeding. Do not skip type errors. - -### 6. Lint - -```bash -pnpm lint -``` - -If lint errors exist in files you modified: - -```bash -pnpm lint:fix -``` - -If errors remain after auto-fix, fix them manually. - -### 7. Format Check - -```bash -pnpm format -``` - -If formatting issues exist in files you modified: - -```bash -pnpm format:fix -``` - -### 8. Commit Changes - -Stage only the files you modified — do NOT use `git add .` or `git add -A`: - -```bash -git add <specific-files> -git commit -m "<type>(issue-<N>): <concise description> - -Implements #<ISSUE_NUMBER> - -- <bullet point summary of changes>" -``` - -Use conventional commit prefixes based on the task type label: -- `task:implementation` -> `feat` -- `task:bug-fix` -> `fix` -- `task:refactor` -> `refactor` -- `task:test-generation` -> `test` -- `task:documentation` -> `docs` - -If the task requires multiple logical units of work, make separate commits for each. - -### 9. 
Capture Governance Decision - -After commit, capture the governance decision record for audit trail: - -```bash -<%= paths.cli %> inspect --last 2>/dev/null -``` - -This records the governance decisions made during implementation, which will be included in the PR body by the `create-pr` skill. - -## Rules - -- Do NOT modify files in `packages/kernel/src/**`, `packages/policy/src/**`, or `packages/invariants/src/**` unless the issue explicitly authorizes it -- Do NOT modify `<%= paths.policy %>` or `.claude/settings.json` -- Do NOT use `git add .` or `git add -A` — stage specific files only -- If pre-commit simulation denies a file, do NOT commit it — report the denial -- If you cannot complete the implementation, commit what you have and note incomplete items in the PR body -- Write tests for new functionality when the task type is `task:implementation` or `task:bug-fix` diff --git a/packages/swarm/templates/skills/marketing-content.md b/packages/swarm/templates/skills/marketing-content.md deleted file mode 100644 index eeaa5559..00000000 --- a/packages/swarm/templates/skills/marketing-content.md +++ /dev/null @@ -1,253 +0,0 @@ -# Skill: Marketing Content Generator - -Generate a weekly content calendar with ready-to-post drafts for LinkedIn, Twitter/X, and blog posts based on recent project activity, releases, and milestones. Publish the content pack as a GitHub issue for human review before posting. Designed for weekly scheduled execution. - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Gather Recent Activity - -Collect all project activity from the last 7 days to use as content source material. 
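The seven-day window used throughout this step can be applied with a small shell helper. The sketch below is illustrative only: `is_recent` and `cutoff` are hypothetical names that are not part of the skill, and it assumes timestamps arrive in the `YYYY-MM-DDTHH:MM:SSZ` form, so that plain string comparison orders them chronologically. It also assumes a `date` that accepts either the GNU or the BSD/macOS relative-date syntax.

```bash
# Hypothetical helper (not part of the skill): keep only items from the last 7 days.
# GNU date syntax is tried first, then the BSD/macOS form.
cutoff="$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ 2>/dev/null \
  || date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)"

# ISO-8601 UTC timestamps in one fixed format sort lexicographically,
# so a string comparison against the cutoff is sufficient.
is_recent() {
  [ "$1" \> "$cutoff" ]
}

# Usage sketch (needs an authenticated gh, so it is left commented out):
# gh pr list --state merged --limit 20 --json number,mergedAt \
#   --jq '.[] | "\(.number) \(.mergedAt)"' |
#   while read -r number merged_at; do
#     if is_recent "$merged_at"; then echo "#$number"; fi
#   done
```

The same helper would apply to `closedAt` and `publishedAt` values, since only the timestamp string is compared.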
- -**Merged PRs** (primary content source): - -```bash -gh pr list --state merged --limit 20 --json number,title,body,mergedAt,labels,additions,deletions,headRefName -``` - -Filter to only PRs merged in the last 7 days. - -**Closed issues**: - -```bash -gh issue list --state closed --limit 30 --json number,title,body,closedAt,labels -``` - -Filter to only issues closed in the last 7 days. - -**Recent releases**: - -```bash -gh release list --limit 5 --json tagName,name,publishedAt,body,isPrerelease -``` - -Filter to releases published in the last 7 days. Flag these as **high-priority content** — each release gets a dedicated announcement post. - -**Roadmap progress** (for narrative context): - -```bash -cat <%= paths.roadmap %> -``` - -Parse the current phase, overall progress percentage, and any recently checked-off items. - -### 3. Categorize Content Themes - -Group the gathered activity into content themes. Each theme becomes a content piece: - -| Theme Type | Source | Content Priority | -|---|---|---| -| **Release Announcement** | New release published | HIGH — always draft | -| **Feature Spotlight** | Merged PR with `task:feature` label or significant additions | HIGH | -| **Technical Deep Dive** | Merged PR with complex changes (>200 lines) or architectural impact | MEDIUM | -| **Milestone Update** | ROADMAP phase completion or significant progress | MEDIUM | -| **Bug Fix Roundup** | Multiple merged PRs with `task:bug-fix` label | LOW | -| **Community/OSS** | New contributors, dependency updates, security fixes | LOW | -| **Behind the Scenes** | Interesting governance events, agent swarm activity, CI improvements | LOW | - -Select the **top 5 themes** by priority for the weekly content calendar. Always include release announcements if any exist. - -### 4. 
Generate Content Calendar - -Create a 5-day content calendar (Monday through Friday) assigning one theme per day: - -- **Monday**: Strongest theme (release announcement or major feature) — highest engagement day -- **Tuesday**: Technical deep dive or feature spotlight -- **Wednesday**: Milestone update or behind-the-scenes -- **Thursday**: Community-focused or educational content -- **Friday**: Lighter content — roundup, tips, or forward-looking teaser - -If fewer than 5 themes were identified, consolidate to fewer days and note which days to skip. - -### 5. Draft LinkedIn Posts - -For each calendar slot, draft a LinkedIn post following this structure: - -**Format guidelines:** -- **Hook line**: First line must grab attention (question, bold statement, or surprising metric). This line appears before the "see more" fold — it must compel the click. -- **Body**: 3-5 short paragraphs. Use line breaks liberally. LinkedIn rewards readability. -- **Specifics**: Include concrete numbers (lines of code, PRs merged, issues closed, test counts). Avoid vague claims. -- **Narrative**: Frame technical work in terms of the problem it solves, not the implementation details. Speak to the "why." -- **CTA**: End with a question or call-to-action (try the tool, star the repo, share thoughts). -- **Hashtags**: 3-5 relevant hashtags at the end. Always include `#OpenSource` and `#DevTools`. Add topic-specific tags. -- **Length**: 800-1300 characters (LinkedIn sweet spot for engagement). -- **Tone**: Conversational but knowledgeable. First person. Enthusiastic without being salesy. Share the builder's perspective. - -**For release announcements specifically:** -- Lead with the version number and the single most impactful change -- Include a "what's new" bullet list (max 5 items) -- Link to the release/changelog -- Frame it as a milestone in the larger vision - -### 6. 
Draft Twitter/X Posts - -For each calendar slot, draft a Twitter/X post (thread if needed): - -**Format guidelines:** -- **Single tweet**: 280 characters max. Punchy, direct, one key point. -- **Thread (for releases or features)**: 3-5 tweets max. First tweet must stand alone. Number tweets (1/N format). -- **Visuals**: Note where a screenshot, diagram, or code snippet would add value (mark as `[IMAGE: description]`). -- **Tone**: More casual than LinkedIn. Technical audience. Can use dev humor. -- **Hashtags**: 1-2 max on Twitter. `#opensource` plus one topic tag. - -### 7. Draft Blog Post Outlines - -For the **top 2 themes** of the week, create a blog post outline: - -**Outline structure:** -- **Title**: SEO-friendly, specific (not generic) -- **Subtitle**: One-sentence hook -- **Target audience**: Who should read this -- **Sections** (3-5): Section title + 2-3 bullet points of what to cover -- **Key code examples**: Note which code snippets or configurations to showcase -- **Estimated length**: Word count target (800-1500 words) - -These are outlines only — not full blog posts. They give the human author a head start. - -### 8. Check for Previous Content Pack - -Look for existing content pack issues: - -```bash -gh issue list --state open --label "source:marketing-agent" --json number --jq '.[0].number' -``` - -If a previous content pack exists, close it with a forward reference: - -```bash -gh issue close <PREV_NUMBER> --comment "Superseded by new weekly content pack." -``` - -### 9. 
Publish Content Pack Issue - -Ensure the label exists: - -```bash -gh label create "source:marketing-agent" --color "FFA500" --description "Auto-created by Marketing Content Agent" 2>/dev/null || true -``` - -Create the content pack issue: - -```bash -gh issue create \ - --title "Weekly Content Pack — $(date +%Y-%m-%d)" \ - --body "<content pack markdown>" \ - --label "source:marketing-agent" --label "<%= labels.pending %>" -``` - -**Issue body structure:** - -```markdown -## Weekly Content Pack - -**Generated**: <timestamp UTC> -**Period**: <7-day date range> -**Activity summary**: <N PRs merged, N issues closed, N releases> - ---- - -## Content Calendar - -| Day | Theme | Platform Focus | -|---|---|---| -| Monday | ... | LinkedIn + Twitter | -| Tuesday | ... | LinkedIn | -| ... | ... | ... | - ---- - -## LinkedIn Posts - -### Monday — <Theme Title> - -<full draft post text> - ---- - -### Tuesday — <Theme Title> - -<full draft post text> - ---- - -(repeat for each day) - ---- - -## Twitter/X Posts - -### Monday — <Theme Title> - -<draft tweet or thread> - ---- - -(repeat for each day) - ---- - -## Blog Post Outlines - -### 1. <Blog Title> - -<outline> - -### 2. <Blog Title> - -<outline> - ---- - -## Source Material - -<bulleted list of PR numbers, issue numbers, and release tags used as source> - ---- - -*Generated by marketing-content-agent on <timestamp>* -``` - -### 10. 
Summary - -Report: -- **Activity scanned**: N PRs merged, N issues closed, N releases -- **Themes identified**: N (list theme types) -- **Content calendar days filled**: N/5 -- **LinkedIn posts drafted**: N -- **Twitter/X posts drafted**: N -- **Blog outlines created**: N -- **Content pack issue created**: #N -- **Previous pack closed**: #N (or "none") - -## Rules - -- Create a maximum of **1 content pack issue per run** -- **Never post content to any external platform** — this skill only creates GitHub issue drafts for human review -- **Never close unrelated issues** — the only issues this skill may close are previous content pack issues labeled `source:marketing-agent` -- **Never modify other issues** — apart from closing a superseded content pack, this skill only creates its own content pack issue -- **Never fabricate metrics** — only use real numbers from the GitHub data gathered in Step 2 -- **Never invent features or capabilities** — only describe what was actually merged or released -- If no significant activity occurred in the last 7 days, create a minimal content pack noting the quiet week and suggest evergreen content topics instead -- If `gh` CLI is not authenticated, report the error and STOP -- Release announcements always take priority — if a release was published, it must appear in the calendar -- Blog outlines are suggestions only — keep them concise (not full drafts) -- Hashtag recommendations should be relevant to the actual content, not generic padding -- All content should be written from the perspective of the project maintainer/builder -- Content should be authentic and technical — avoid marketing buzzwords and hype language diff --git a/packages/swarm/templates/skills/observability-review.md b/packages/swarm/templates/skills/observability-review.md deleted file mode 100644 index 7e9e992e..00000000 --- a/packages/swarm/templates/skills/observability-review.md +++ /dev/null @@ -1,481 +0,0 @@ -# Skill: Observability Review - -Analyze runtime telemetry, governance event patterns, decision records, risk score trends, CI pipeline 
trends, and build metrics to surface operational health signals. Detect anomalies, regressions, and trends that other agents cannot see. Publish an Observability Report. Designed for daily scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If data is unavailable or ambiguous, proceed with available data and note limitations -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation -- When in doubt about anomaly severity, round **up** (flag rather than ignore) - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Collect Cross-Session Analytics - -Use the AgentGuard analytics engine for aggregated cross-session data: - -```bash -<%= paths.cli %> analytics --format json 2>/dev/null | head -200 -``` - -Extract: -- **Per-session risk scores** (trend over time) -- **Violation clustering** by dimension (action type, branch, target) -- **Cross-session denial rate** trend -- **Top violation patterns** (recurring invariant or policy violations) - -If analytics is not available, fall back to manual aggregation in Step 3. - -### 3. 
Collect Governance Telemetry - -Read the runtime telemetry log: - -```bash -cat <%= paths.logs %> 2>/dev/null | tail -500 -``` - -Parse each line as JSON with schema: -``` -{timestamp, agent, run_id, syscall, target, capability, policy_result, invariant_result} -``` - -If the file does not exist or is empty, note "No telemetry data available" and continue with other data sources. - -Aggregate: -- **Total events** in the log -- **Events in last 24 hours**: Filter by timestamp -- **Events in last 7 days**: Filter by timestamp -- **Action type distribution**: Count by `syscall` (file.read, file.write, git.push, etc.) -- **Policy result distribution**: Count of allow vs. deny -- **Invariant result distribution**: Count of pass vs. fail -- **Agent distribution**: Count by `agent` field -- **Denial rate**: deny / total as percentage (last 24h and 7d) -- **Invariant failure rate**: fail / total as percentage (last 24h and 7d) - -### 4. Analyze Decision Records and Risk Scores - -List available decision log files: - -```bash -ls -la .agentguard/decisions/ 2>/dev/null | tail -20 -``` - -Read the most recent decision logs (up to 5 files, most recent first): - -```bash -for f in $(ls -t .agentguard/decisions/*.jsonl 2>/dev/null | head -5); do cat "$f"; done -``` - -Parse each `GovernanceDecisionRecord` and aggregate: -- **Outcome distribution**: allow vs. deny counts -- **Intervention types**: deny, rollback, pause, test-only (count each) -- **Escalation levels observed**: Distribution of 0 (NORMAL) through 3 (LOCKDOWN) -- **Top denial reasons**: Group by `reason` field, count occurrences -- **Invariant violations**: Group by invariant name, count occurrences -- **Policy matches**: Group by `policy.matchedPolicyName`, count occurrences -- **Execution success rate**: executed actions that succeeded vs. 
failed -- **Average decision-to-execution time**: From `execution.durationMs` where available -- **Per-session risk scores**: Extract risk score from each session's decision records - -### 5. Check Tracepoint Data - -Look for kernel-level tracepoint data for performance and pipeline health: - -```bash -grep "tracepoint\|trace_kind" <%= paths.logs %> 2>/dev/null | tail -50 -``` - -If tracepoint data is available, extract: -- **Kernel pipeline latency**: Time spent in aab.normalize, policy.evaluate, invariant.check stages -- **Slow operations**: Any tracepoint with duration > 100ms -- **Adapter dispatch failures**: Failed adapter.dispatch tracepoints - -If no tracepoint data exists, note "Tracepoint data: not available" and skip. - -### 6. Detect Anomalies - -Compare recent patterns (last 24h) against baseline (last 7d) to detect: - -**Escalation anomalies**: -- Any LOCKDOWN events (escalation level 3) — always flag as critical -- HIGH escalation events (level 2) — flag if more than 2 in 24h -- Escalation level increasing over time (trend) - -**Denial rate anomalies**: -- Denial rate >20% in last 24h (high denial signal) -- Denial rate increased >10 percentage points vs. 7-day average -- Single agent responsible for >50% of denials - -**Invariant violation anomalies**: -- Any new invariant type that wasn't violated in prior 7 days -- Invariant violation rate >5% (high violation signal) -- Repeated violations of the same invariant (>3 in 24h) - -**Risk score anomalies**: -- Per-session risk score >70 (high risk session) -- Risk score trend increasing over last 5 sessions -- Any session with risk level "critical" - -**Volume anomalies**: -- Event volume dropped >50% vs. 7-day daily average (agents may be stalled) -- Event volume spiked >200% vs. 7-day daily average (unusual activity) - -### 7. 
Analyze CI Pipeline Health - -Fetch recent CI workflow runs: - -```bash -gh run list --limit 30 --json databaseId,conclusion,headBranch,createdAt,name,updatedAt -``` - -Calculate: -- **Overall pass rate**: % with conclusion "success" (last 30 runs) -- **Pass rate trend**: Compare last 10 runs vs. prior 20 runs -- **Failure breakdown**: Group failures by workflow name and branch -- **Mean time to recovery (MTTR)**: Average time between a failure and next success on the same branch -- **Currently failing branches**: Branches where the most recent run failed - -Fetch workflow run details for recent failures (up to 3): - -```bash -gh run view <RUN_ID> --json jobs --jq '.jobs[] | select(.conclusion == "failure") | {name, conclusion, steps: [.steps[] | select(.conclusion == "failure") | .name]}' -``` - -Identify: -- **Failure hotspots**: Which CI jobs fail most often (lint, typecheck, test, build) -- **Flaky patterns**: Same branch/commit with both pass and fail results - -### 8. Analyze Build Metrics - -Check the current build output: - -```bash -ls -la apps/cli/dist/bin.js 2>/dev/null -ls -la apps/cli/dist/bin.js.map 2>/dev/null -``` - -Record: -- **CLI bundle size**: File size of `apps/cli/dist/bin.js` -- **Source map size**: File size of `apps/cli/dist/bin.js.map` - -Check dependency health: - -```bash -pnpm audit --json 2>/dev/null | head -100 -pnpm outdated --json 2>/dev/null | head -50 -``` - -Record: -- **Vulnerability count**: By severity (critical, high, moderate, low) -- **Outdated packages**: Count and list of outdated dependencies - -### 9. 
Analyze Agent Activity Patterns - -From the telemetry data (Step 3), analyze per-agent behavior: - -For each unique agent in the telemetry: -- **Action volume**: Total actions in last 24h -- **Action types**: Distribution of syscall types -- **Denial rate**: Per-agent denial percentage -- **Target patterns**: Most frequently targeted files/paths -- **Activity timeline**: When the agent was most active (hour buckets) - -Detect: -- **Idle agents**: Agents with no activity in last 24h that were active in prior 7 days -- **Hyperactive agents**: Agents with >100 actions in last 24h -- **Permission-seeking agents**: Agents with denial rate >30% -- **Narrow-scope agents**: Agents that only touch 1-2 file paths repeatedly - -### 10. Check Scheduled Agent Health - -Verify all scheduled agents are running: - -```bash -gh issue list --state open --label "source:planning-agent" --limit 1 --json number,createdAt -gh issue list --state open --label "source:product-agent" --limit 1 --json number,createdAt -gh issue list --state open --label "source:test-agent" --limit 1 --json number,createdAt -gh issue list --state open --label "source:backlog-steward" --limit 1 --json number,createdAt -``` - -For each agent, check if it has produced output recently: -- **Healthy**: Output issue exists and was created/updated in last 48h -- **Stale**: Output issue exists but is older than 48h -- **Missing**: No output issue found (agent may not be running) - -### 11. Generate Observability Report - -Compose a structured report in markdown: - -**Header**: -- Generation timestamp (UTC) -- HEAD commit SHA -- Reporting period (last 24h with 7d baseline) - -**System Health Dashboard**: -| Metric | Last 24h | 7-Day Avg | Trend | Status | -Showing: event volume, denial rate, invariant failure rate, CI pass rate, escalation level, risk score. 
- -Use status indicators: -- `HEALTHY` — metric within normal range -- `WARNING` — metric approaching threshold -- `CRITICAL` — metric exceeds threshold or anomaly detected - -**Risk Score Trend**: -| Session | Date | Risk Score | Risk Level | -Showing per-session risk scores for the last 5-10 sessions, with trend arrow. - -**Governance Event Summary** (table): -| Action Type | Total | Allowed | Denied | Denial Rate | -Broken down by syscall type. - -**Decision Record Summary** (table): -| Metric | Value | -Showing: total decisions, deny outcomes, intervention types, escalation levels observed. - -**Anomalies Detected** (list): -Each anomaly with: -- Severity (CRITICAL / WARNING / INFO) -- Description -- Evidence (specific numbers and comparisons) -- Recommended action - -**Top Denial Reasons** (table, top 10): -| Reason | Count | % of Denials | Affected Agents | - -**Invariant Health** (table): -| Invariant | Violations (24h) | Violations (7d) | Status | - -**CI Pipeline Metrics**: -- Pass rate (with trend arrow) -- Failure hotspots -- MTTR -- Currently failing branches -- Flaky test signals - -**Build Metrics**: -- Bundle size -- Vulnerability summary -- Outdated dependency count - -**Agent Activity Matrix** (table): -| Agent | Actions (24h) | Denial Rate | Top Syscall | Status | -Showing activity for each detected agent. - -**Scheduled Agent Health** (table): -| Agent | Last Output | Age | Status | -Showing liveness for all scheduled agents. - -**Trend Analysis**: -- 7-day governance activity trend (daily totals) -- Denial rate trend (is it increasing, decreasing, or stable?) -- CI pass rate trend -- Risk score trend (per-session over time) - -**Recommendations** (numbered, max 5): -Top 5 operational actions prioritized by severity: -1. Critical anomalies to investigate -2. Failing CI to fix -3. Agents that need attention -4. Policy gaps to address -5. Infrastructure improvements - -### 12. 
Route Output (Report Routing Protocol) - -Apply the `report-routing` protocol to determine where output goes: - -**Assess severity**: Check if ANY of the following critical conditions exist: -- LOCKDOWN event detected -- CRITICAL anomalies found -- CI completely broken (0% pass rate) -- Risk score >50 -- Sustained denial rate >20% -- Deadlock or livelock detected - -**If critical conditions exist → ALERT tier**: - -Check for existing alert from this agent: - -```bash -gh issue list --state open --label "source:observability-agent" --label "<%= labels.critical %>" --json number,title -``` - -If no existing alert covers the anomaly: - -```bash -gh issue create \ - --title "ALERT: <anomaly description> — $(date +%Y-%m-%d)" \ - --body "<anomaly details with evidence and recommended action>" \ - --label "source:observability-agent" --label "<%= labels.critical %>" --label "<%= labels.pending %>" -``` - -Cap at **1 alert issue per run**. Do NOT create a separate "Observability Report" issue. - -**Always write the full report to REPORT tier** (regardless of alert): - -```bash -mkdir -p .agentguard/reports -cat > .agentguard/reports/observability-agent-$(date +%Y-%m-%d).md <<'REPORT_EOF' -<full observability report markdown> -REPORT_EOF -``` - -Close any previous observability report issues that are still open: - -```bash -PREV=$(gh issue list --state open --label "source:observability-agent" --json number,labels --jq '[.[] | select(.labels | map(.name) | index("<%= labels.critical %>") | not)] | .[].number' 2>/dev/null) -for num in $PREV; do - gh issue close "$num" --comment "Superseded — reports now written to .agentguard/reports/" 2>/dev/null || true -done -``` - -**If no anomalies detected → LOG tier** (in addition to REPORT file): - -```bash -mkdir -p .agentguard/logs -echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) [observability-agent] No anomalies. Denial rate: N%. CI pass rate: N%. Risk: N/100." >> .agentguard/logs/swarm.log -``` - -### 14. 
Swarm Health Check - -Analyze control plane health and include a "## Swarm Health" section in the report. - -#### 14a. PR Queue Depth - -```bash -gh pr list --author @me --state open --json number --jq length -``` - -- If count > 10: flag as "PR queue overloaded" (CRITICAL) -- If count > 5: flag as "PR queue elevated" (WARNING) - -#### 14b. Issue Creation Rate - -```bash -gh issue list --label "source:backlog-steward" --state open --json createdAt --jq '[.[] | select(.createdAt > "'$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ)'")] | length' 2>/dev/null -``` - -- If count > 10: flag as "Issue flood detected" (WARNING) - -#### 14c. Merge Conflict Count - -```bash -gh pr list --state open --json mergeable --jq '[.[] | select(.mergeable == "CONFLICTING")] | length' -``` - -- If count > 3: flag as "Merge conflict cascade" (WARNING) - -#### 14d. Sprint Plan Freshness - -```bash -gh issue list --label "source:planning-agent" --limit 1 --state open --json createdAt --jq '.[0].createdAt' -``` - -- If older than 48h: flag as "Sprint plan stale" (WARNING) - -#### 14e. Update Swarm State - -Read `<%= paths.swarmState %>` if it exists. Update with: -- `openAgentPRs`: PR count from 14a -- `prQueueHealthy`: true if count < 8 -- `mergeConflicts`: count from 14c -- `lastObservabilityRun`: current ISO timestamp - -Write the updated file back. If the file doesn't exist, create it with these fields. - -#### 14f. 
Deadlock & Livelock Detection - -Check for swarm-level deadlocks and livelocks: - -**Deadlock patterns** (agents waiting on each other, no progress possible): - -```bash -# All PRs blocked by the same failing test -gh pr list --state open --json number,statusCheckRollup --jq '[.[] | select(.statusCheckRollup != null) | select([.statusCheckRollup[] | select(.conclusion == "FAILURE")] | length > 0)] | length' -``` - -- If ALL open PRs fail the same CI check: flag as "Deadlock: all PRs blocked by same CI failure" (CRITICAL) -- If all PRs are CONFLICTING and the Merge Conflict Resolver hasn't produced output in 24h: flag as "Deadlock: conflict cascade with stalled resolver" (CRITICAL) - -**Livelock patterns** (agents active but no forward progress): - -```bash -# PRs opened and closed repeatedly on same issue -gh pr list --state closed --limit 30 --json number,title,headRefName,closedAt,mergedAt --jq '[.[] | select(.mergedAt == null)]' -``` - -- If 3+ PRs were closed-without-merge on the same issue in 7 days: flag as "Livelock: repeated failed attempts on same issue" (WARNING) -- If the same PR has been rebased 5+ times without merging: flag as "Livelock: rebase loop" (WARNING) - -```bash -# Check for circular dependency blocking -gh issue list --state open --label "<%= labels.blocked %>" --json number,body --limit 20 -``` - -- If issue A is blocked by issue B AND issue B references issue A: flag as "Deadlock: circular dependency" (WARNING) - -**Starvation patterns** (some work never gets done): - -```bash -# Issues older than 30 days with no PR activity -gh issue list --state open --json number,title,createdAt,labels --jq '[.[] | select(.createdAt < "'$(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -u -v-30d +%Y-%m-%dT%H:%M:%SZ)'")]' 2>/dev/null -``` - -- If 5+ issues are older than 30 days with no linked PR: flag as "Starvation: old issues never picked up" (WARNING) - -#### 14g. 
Include in Report - -Add a "## Swarm Health" section to the observability report with a table: - -| Metric | Value | Status | -|--------|-------|--------| -| Open agent PRs | N | HEALTHY/WARNING/CRITICAL | -| Issues created (24h) | N | HEALTHY/WARNING | -| Merge conflicts | N | HEALTHY/WARNING | -| Sprint plan age | Nh | HEALTHY/WARNING | -| Swarm state age | Nh | HEALTHY/WARNING | -| Deadlocks detected | N | HEALTHY/WARNING/CRITICAL | -| Livelocks detected | N | HEALTHY/WARNING | -| Starved issues (30d+) | N | HEALTHY/WARNING | - -### 15. Summary - -Report: -- **Governance events (24h)**: N total, N% denial rate, N% invariant failure rate -- **Escalation level**: NORMAL / ELEVATED / HIGH / LOCKDOWN -- **Risk score**: <N>/100 (<risk level>) -- **CI pass rate**: N% (trend: improving / stable / declining) -- **Anomalies detected**: N (N critical, N warning, N info) -- **Scheduled agents healthy**: N of M -- **Observability report created**: #N -- **Alerts raised**: N -- **Top concern**: Brief statement of the single most important operational finding - -## Rules - -- **Routine reports go to `.agentguard/reports/`, NOT GitHub issues** — follow the report-routing protocol -- Create a maximum of **1 alert issue per run** — only for CRITICAL anomalies -- **Never modify governance logs** — this agent is strictly read-only on telemetry data -- **Never modify source code or tests** — only report findings -- **Never close issues** — only close previous observability report issues labeled `source:observability-agent` -- **Never fix CI failures** — that is the CI Triage Agent's job -- **Never re-prioritize issues** — that is the Planning Agent's job -- If telemetry data is missing, still produce a report from available CI and GitHub data -- If `gh` CLI is not authenticated, report the error and STOP -- Do not create duplicate alert issues — check for existing ones first -- When closing previous reports, verify the issue is actually labeled `source:observability-agent` before 
closing -- Anomaly thresholds should be applied conservatively — flag only when evidence is clear -- The observability agent watches OTHER agents but never takes action on their behalf diff --git a/packages/swarm/templates/skills/policy-effectiveness-review.md b/packages/swarm/templates/skills/policy-effectiveness-review.md deleted file mode 100644 index be19d374..00000000 --- a/packages/swarm/templates/skills/policy-effectiveness-review.md +++ /dev/null @@ -1,290 +0,0 @@ -# Skill: Policy Effectiveness Review - -Analyze the effectiveness of governance policies and invariants. Identify rules that never trigger, detect policy gaps, assess invariant coverage, recommend policy packs, and suggest governance evolution. This is the Governance Agent's unique capability — focused on policy quality, not operational telemetry (which is the Observability Agent's domain). Designed for periodic scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If data is unavailable or ambiguous, proceed with available data and note limitations -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active. If governance cannot be activated, STOP. - -### 2. 
Read Active Policy - -```bash -cat <%= paths.policy %> 2>/dev/null -``` - -If no policy file exists, check for alternative formats: - -```bash -cat agentguard.json 2>/dev/null -ls policy/*.json 2>/dev/null -``` - -Parse the policy to extract: -- **Total rules**: Count of policy rules -- **Rule types**: deny vs. allow rules -- **Scopes**: File patterns, branch patterns, action types covered -- **Conditions**: Branch conditions, environment conditions - -### 3. Validate Policy Quality - -Run automated policy validation with strict best-practice checks: - -```bash -<%= paths.cli %> policy validate --strict --json 2>/dev/null -``` - -Parse the validation output for: -- **Errors**: Invalid rules, syntax issues -- **Warnings**: Unrecognized action types, missing descriptions, overlapping rules, rule shadowing -- **Best-practice violations**: No deny rules, missing blast radius limits - -If the validate command is not available, skip this step and rely on manual analysis. - -### 4. Read Invariant Definitions - -```bash -cat packages/invariants/src/definitions.ts -``` - -Extract the 8 built-in invariant names and their trigger conditions: -1. No secret exposure (severity 5) -2. Protected branches (severity 4) -3. Blast radius limit (severity 3) -4. Test-before-push (severity 3) -5. No force push (severity 4) -6. No skill modification (severity 4) -7. No scheduled task modification (severity 5) -8. Lockfile integrity (severity 2) - -### 5. 
Analyze Policy Rule Usage - -Read governance logs to determine which rules are actually triggered: - -```bash -cat .agentguard/events/*.jsonl 2>/dev/null | grep "PolicyDenied\|ActionAllowed\|ActionDenied" | head -200 -``` - -For each policy rule: -- Count how many times it matched (either allow or deny) -- Identify rules that have **never** been triggered ("dead rules") -- Identify rules that trigger most frequently ("hot rules") -- Identify rules that only deny (may be too restrictive) -- Identify rules that only allow (may be too permissive) - -### 6. Analyze Invariant Effectiveness - -From governance logs: - -```bash -cat .agentguard/events/*.jsonl 2>/dev/null | grep "InvariantViolation" | head -100 -``` - -For each of the 8 invariants: -- Count violation frequency -- Identify invariants that have **never** been violated (may indicate: well-behaved agents OR overly broad invariant that catches nothing real) -- Identify invariants that are violated repeatedly (may indicate: agents don't understand the boundary OR the invariant is too strict) -- Check if violations lead to denials or warnings - -### 7. Detect Policy Gaps - -Analyze action patterns that pass all policy checks but might be concerning: - -```bash -cat .agentguard/events/*.jsonl 2>/dev/null | grep "ActionAllowed" | head -100 -``` - -Look for: -- **Unscoped actions**: Actions on paths not covered by any specific policy rule (falls through to default) -- **Novel action types**: Action types that appear in logs but have no dedicated policy rule -- **High-frequency allows**: Actions that are always allowed — should any of them be restricted? 
-- **Uncovered branches**: Git branches where actions occur but no branch-specific policy exists - -Also run simulation against common risky patterns to detect coverage gaps: - -```bash -<%= paths.cli %> simulate --action file.write --target .env.production --policy <%= paths.policy %> --json 2>/dev/null -<%= paths.cli %> simulate --action git.push --branch main --policy <%= paths.policy %> --json 2>/dev/null -<%= paths.cli %> simulate --action shell.exec --command "rm -rf /" --policy <%= paths.policy %> --json 2>/dev/null -``` - -If any of these simulations show "allowed", flag as a policy gap. - -### 8. Analyze Available Policy Packs - -Read the available policy packs to recommend composition strategies: - -```bash -ls policies/*/agentguard-pack.yaml 2>/dev/null -``` - -For each policy pack found, read its description and rules: - -```bash -head -10 policies/ci-safe/agentguard-pack.yaml 2>/dev/null -head -10 policies/enterprise/agentguard-pack.yaml 2>/dev/null -head -10 policies/strict/agentguard-pack.yaml 2>/dev/null -head -10 policies/open-source/agentguard-pack.yaml 2>/dev/null -``` - -Compare the active policy against available packs: -- If the active policy lacks CI safety rules → recommend `ci-safe` pack -- If the active policy lacks enterprise controls (blast radius limits, credential protection) → recommend `enterprise` pack -- If the active policy has permissive defaults → recommend `strict` pack -- If the project is open-source → recommend `open-source` pack - -### 9. Cross-Reference with Architecture - -Compare policy coverage against the architectural layers: - -- Does the policy cover all workspace packages? (kernel, events, policy, invariants, adapters, cli, core) -- Are protected paths (`packages/kernel/src/`, `packages/invariants/src/`) reflected in policy rules? -- Does the blast radius invariant align with the actual module structure? -- Are there action types in `packages/core/src/actions.ts` with no corresponding policy rule? - -### 10. 
Generate Effectiveness Report - -``` -## Policy Effectiveness Report - -**Date**: <timestamp> -**Policy file**: <path> -**Total rules**: N -**Total invariants**: 8 -**Policy validation**: <pass/N errors/N warnings> - -### Automated Validation Results - -<Output from `agentguard policy validate --strict` if available> - -### Rule Usage Summary - -| Rule | Matches | Denials | Allows | Status | -|------|---------|---------|--------|--------| -| <rule-name/pattern> | N | N | N | ACTIVE/DEAD/HOT | - -### Dead Rules (never triggered) - -<List of rules that have never matched any action — candidates for removal or revision> - -### Hot Rules (most triggered) - -<Top 5 most frequently triggered rules — verify they are working as intended> - -### Invariant Effectiveness - -| Invariant | Severity | Violations | Status | -|-----------|----------|-----------|--------| -| no-secret-exposure | 5 | N | ACTIVE/DORMANT | -| protected-branch | 4 | N | ACTIVE/DORMANT | -| blast-radius-limit | 3 | N | ACTIVE/DORMANT | -| test-before-push | 3 | N | ACTIVE/DORMANT | -| no-force-push | 4 | N | ACTIVE/DORMANT | -| no-skill-modification | 4 | N | ACTIVE/DORMANT | -| no-scheduled-task-modification | 5 | N | ACTIVE/DORMANT | -| lockfile-integrity | 2 | N | ACTIVE/DORMANT | - -### Policy Gaps - -<List of detected coverage gaps with evidence> - -### Policy Pack Recommendations - -| Pack | Relevance | Reason | -|------|-----------|--------| -| ci-safe | HIGH/MEDIUM/LOW | <why this pack would help> | -| enterprise | HIGH/MEDIUM/LOW | <why this pack would help> | -| strict | HIGH/MEDIUM/LOW | <why this pack would help> | -| open-source | HIGH/MEDIUM/LOW | <why this pack would help> | - -### Governance Evolution Recommendations - -<Prioritized list of policy/invariant changes:> -1. Rules to add (fill gaps) -2. Rules to remove (dead rules) -3. Rules to tighten (too permissive) -4. Rules to relax (too restrictive) -5. Policy packs to compose -6. New invariants to consider -``` - -### 11. 
Route Output (Report Routing Protocol) - -Ensure label exists: - -```bash -gh label create "source:governance-agent" --color "5319E7" --description "Auto-created by Governance Agent" 2>/dev/null || true -``` - -**If critical policy gaps or validation errors found → ALERT tier**: - -```bash -gh issue create \ - --title "ALERT: governance policy issues — <N> critical findings" \ - --body "<critical policy findings>" \ - --label "source:governance-agent" --label "<%= labels.critical %>" -``` - -**If actionable recommendations exist (but not critical) → REPORT tier**: - -```bash -mkdir -p .agentguard/reports -cat > .agentguard/reports/governance-agent-$(date +%Y-%m-%d).md <<'REPORT_EOF' -<full effectiveness report with recommendations> -REPORT_EOF -``` - -Close any previous non-critical governance review issues: - -```bash -PREV=$(gh issue list --state open --label "source:governance-agent" --json number,labels --jq '[.[] | select(.labels | map(.name) | index("<%= labels.critical %>") | not)] | .[].number' 2>/dev/null) -for num in $PREV; do - gh issue close "$num" --comment "Superseded — policy reviews now written to .agentguard/reports/" 2>/dev/null || true -done -``` - -**If all policies effective → LOG tier**: - -```bash -mkdir -p .agentguard/logs -echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) [governance-agent] Policies effective. N rules analyzed, N active, 0 gaps." >> .agentguard/logs/swarm.log -``` - -### 12. 
Summary - -Report: -- **Rules analyzed**: N total, N active, N dead -- **Invariants**: N of 8 active, N dormant -- **Policy validation**: pass / N errors / N warnings -- **Policy gaps**: N detected -- **Policy pack recommendations**: N packs suggested -- **Recommendations**: N governance evolution items -- **Output routed to**: ALERT / REPORT / LOG -- If all healthy: "Governance policies effective — no changes recommended" - -## Rules - -- **Routine policy reviews go to `.agentguard/reports/`, NOT GitHub issues** — follow the report-routing protocol -- Create a maximum of **1 alert issue per run** — only for critical policy gaps or validation errors -- **Never modify policy files** — only analyze and recommend. -- **Never modify invariant definitions** — only assess effectiveness. -- This skill focuses on **policy quality** — leave operational metrics to the Observability Agent. -- If no governance logs exist, analyze the policy file statically (rules without usage data) and note the limitation. -- If `gh` CLI is not authenticated, still generate the report to console but skip issue creation. -- Cap log analysis at 200 events per category to keep processing bounded. diff --git a/packages/swarm/templates/skills/pr-merger.md b/packages/swarm/templates/skills/pr-merger.md deleted file mode 100644 index 9511f482..00000000 --- a/packages/swarm/templates/skills/pr-merger.md +++ /dev/null @@ -1,127 +0,0 @@ -# Skill: PR Merger - -Auto-merge pull requests that have passed all quality gates: CI passing, no merge conflicts, reviews approved or no changes requested, and all threads resolved. Designed for periodic scheduled execution as the final step in the autonomous SDLC pipeline. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. 
- -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If any step fails, log the error and move on to the next PR -- Default to the **safest option** in every ambiguous situation (skip merge > attempt merge) - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 1b. Check System Mode - -```bash -cat <%= paths.swarmState %> 2>/dev/null | grep -o '"mode":"[^"]*"' 2>/dev/null -``` - -- If mode is `safe`: output "System in SAFE MODE — skipping PR merging" and **STOP immediately** -- If mode is `conservative`: only merge PRs with `blast_radius < 200` lines (additions + deletions) AND at least 1 approving review from a human (not just agent review) - -### 2. Ensure Labels Exist - -```bash -gh label create "merge:failed" --color "D93F0B" --description "Auto-merge failed" 2>/dev/null || true -gh label create "do-not-merge" --color "B60205" --description "Do not auto-merge this PR" 2>/dev/null || true -``` - -### 3. List Candidate PRs - -```bash -gh pr list --state open --json number,title,headRefName,mergeable,isDraft,labels,createdAt,reviewDecision,statusCheckRollup --limit 20 -``` - -### 4. Filter by Merge Criteria - -A PR is eligible for auto-merge ONLY if ALL conditions are met: - -1. **Not a draft**: `isDraft` is `false` -2. **No blocking labels**: Does NOT have `conflict:needs-human`, `do-not-merge`, `needs:refinement`, or `blocked` -3. **Mergeable**: `mergeable` is `MERGEABLE` (not `CONFLICTING` or `UNKNOWN`) -4. **Age gate**: `createdAt` is more than 1 hour ago (gives review agents time to review) -5. 
**CI passing**: All status checks in `statusCheckRollup` have state `SUCCESS`, `NEUTRAL`, or `SKIPPED` -6. **Review state**: Either `reviewDecision` is `APPROVED`, or there are no `CHANGES_REQUESTED` reviews and all review threads are resolved - -If no PRs match all criteria, report "No PRs meet merge criteria. Skipping." and STOP. - -### 5. Process Eligible PRs - -For each eligible PR (max 3 per run, oldest first): - -#### 5a. Double-Check CI Status - -```bash -gh pr checks <NUMBER> --json name,state --jq '.[] | select(.state != "SUCCESS" and .state != "NEUTRAL" and .state != "SKIPPED")' -``` - -If any non-passing checks: skip this PR, log "CI not fully passing for PR #N" - -#### 5b. Double-Check Mergeable State - -```bash -gh pr view <NUMBER> --json mergeable --jq .mergeable -``` - -If not `MERGEABLE`: skip this PR, log "PR #N not mergeable" - -#### 5c. Check Review Threads - -```bash -gh api repos/{owner}/{repo}/pulls/<NUMBER>/reviews --jq '[.[] | select(.state == "CHANGES_REQUESTED")] | length' -``` - -If any reviews with `CHANGES_REQUESTED` that haven't been dismissed: skip this PR - -#### 5d. Merge - -```bash -gh pr merge <NUMBER> --squash --delete-branch -``` - -If merge fails: -- Log the error -- Add label `merge:failed` -- Post a comment: "Auto-merge failed: <error message>. Skipping." -- Continue to next PR - -If merge succeeds: -- Post a comment: "Auto-merged by PR Merger Agent. All quality gates passed: CI green, no conflicts, reviews clear." - -#### 5e. Cooldown - -After each successful merge, wait 10 seconds before processing the next PR. This allows CI to re-evaluate other PRs against the new main. - -### 6. Summary - -Report: -- **PRs eligible**: N -- **PRs merged**: N (list PR numbers and titles) -- **PRs skipped**: N (list with reasons: CI failing, conflicts, reviews pending, etc.) 
-- **PRs not eligible**: N (total open minus eligible) -- If clean: "No PRs meet merge criteria" - -## Rules - -- Merge a maximum of **3 PRs per run** -- **NEVER merge PRs with `do-not-merge` or `conflict:needs-human` labels** -- **NEVER merge draft PRs** -- **NEVER force merge** — only standard squash merge -- **NEVER merge if CI has ANY failing required checks** -- **NEVER merge if there are unresolved `CHANGES_REQUESTED` reviews** -- If unsure about any condition: **SKIP** the PR (do not merge) -- After each merge, wait 10 seconds before the next (let CI re-evaluate) -- Always delete the branch after merge (`--delete-branch`) -- If `gh` CLI is not authenticated, report the error and STOP -- The age gate (1 hour) ensures review agents have time to post comments before merge diff --git a/packages/swarm/templates/skills/product-health-review.md b/packages/swarm/templates/skills/product-health-review.md deleted file mode 100644 index 97f9e4e5..00000000 --- a/packages/swarm/templates/skills/product-health-review.md +++ /dev/null @@ -1,260 +0,0 @@ -# Skill: Product Health Review - -Evaluate whether the autonomous SDLC is building the right things. Audit roadmap alignment, phase progress, issue quality, value drift, and feature completeness. Publish a Product Health Report. Designed for daily scheduled execution. - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. 
Snapshot Current State - -Fetch all open issues, open PRs, and recent activity: - -```bash -gh issue list --state open --limit 200 --json number,title,body,labels,createdAt,updatedAt -gh pr list --state open --json number,title,headRefName,labels,body,additions,deletions,createdAt -gh issue list --state closed --limit 30 --json number,title,closedAt,labels,body -gh pr list --state merged --limit 15 --json number,title,mergedAt,body,labels -``` - -Also fetch the latest sprint plan for cross-reference: - -```bash -gh issue list --state open --label "source:planning-agent" --limit 1 --json number,title,body -``` - -### 3. Read ROADMAP - -Read `<%= paths.roadmap %>` to determine the phase structure: - -```bash -cat <%= paths.roadmap %> -``` - -Parse each phase to extract: -- **Phase number and name** -- **Status** (`COMPLETE`, `MOSTLY COMPLETE`, `PARTIALLY COMPLETE`, or `PLANNED`) -- **Total items**: Count of all `- [x]` and `- [ ]` items -- **Completed items**: Count of `- [x]` items -- **Remaining items**: Count of `- [ ]` items with their text - -Identify: -- **Current phase**: First phase not fully `COMPLETE` -- **Next phase**: Phase after current -- **Overall progress**: Total checked / total items across all phases - -### 4. Roadmap Alignment Audit - -For every open issue and every PR merged in the last 14 days, determine which ROADMAP phase it maps to. - -**Mapping heuristic** (in order of confidence): -1. Issue body explicitly references a phase (e.g., "Phase 4", "Plugin Ecosystem") -2. Issue title or body mentions a ROADMAP line item keyword (e.g., "policy pack", "VS Code extension", "replay") -3. Issue labels contain a `phase:N` label -4. Issue file scope paths map to a ROADMAP area (e.g., `packages/events/src/` → Phase 1, `packages/kernel/src/` → Phase 2, `apps/cli/src/` → Phase 3) -5. 
No mapping found → classify as **orphaned** - -Produce three lists: -- **Aligned**: Issues/PRs that map to a ROADMAP phase -- **Orphaned**: Issues/PRs with no clear ROADMAP mapping -- **Cross-cutting**: Issues that span multiple phases (infrastructure, docs, tooling) - -### 5. Phase Progress Assessment - -For each ROADMAP phase (especially the current and next phases), calculate: - -- **ROADMAP completion**: `checked_items / total_items` as percentage -- **Open issues in this phase**: Count of aligned open issues -- **Closed issues in last 14 days**: Count of aligned recently closed issues -- **Open PRs in this phase**: Count of aligned open PRs -- **Velocity**: Issues closed per week in this phase (last 14 days / 2) -- **Estimated remaining work**: `remaining_items / velocity_per_week` (if velocity > 0) - -Flag: -- **Stalled phases**: Current phase with 0 velocity in last 14 days -- **Phase regression**: If a previously `COMPLETE` phase now has open issues targeting it - -### 6. Issue Quality Assessment - -For each open issue, evaluate quality against these criteria: - -| Criterion | Check | -|-----------|-------| -| **Has description** | Body is non-empty and >50 characters | -| **Has task type label** | Has one of: `task:feature`, `task:bug`, `task:chore`, `task:docs`, `task:test` | -| **Has status label** | Has one of: `status:pending`, `status:in-progress`, `status:blocked` | -| **Has priority label** | Has one of: `priority:critical`, `priority:high`, `priority:medium`, `priority:low` | -| **Has file scope** | Body contains a `## File Scope` section or path references | -| **Has acceptance criteria** | Body contains `## Acceptance Criteria`, `## Done When`, or a checkbox list | -| **Has dependencies documented** | Body contains `## Dependencies` section (if applicable) | - -Score each issue 0-7 based on criteria met. 
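
As a rough illustration of the scoring above, the sketch below awards one point per criterion and maps the total to a quality tier (5-7 well-defined, 3-4 needs refinement, 0-2 underspecified). The function names `score_issue` and `classify_score` are hypothetical, and two details are assumptions rather than part of the skill: "path references" is approximated as any mention of `packages/` or `apps/`, and a `## Dependencies` heading always earns the point.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the skill: score one issue 0-7.
# $1 = issue body, $2 = comma-separated label names
score_issue() {
  local body="$1" labels=",$2," score=0
  local task_re=',task:(feature|bug|chore|docs|test),'
  local status_re=',status:(pending|in-progress|blocked),'
  local priority_re=',priority:(critical|high|medium|low),'

  # 1. Description: body longer than 50 characters
  if (( ${#body} > 50 )); then score=$((score + 1)); fi
  # 2-4. Task type, status, and priority labels
  if [[ "$labels" =~ $task_re ]]; then score=$((score + 1)); fi
  if [[ "$labels" =~ $status_re ]]; then score=$((score + 1)); fi
  if [[ "$labels" =~ $priority_re ]]; then score=$((score + 1)); fi
  # 5. File scope: heading, or a packages/ or apps/ path (assumed heuristic)
  if grep -Eq '^## File Scope|(packages|apps)/' <<<"$body"; then score=$((score + 1)); fi
  # 6. Acceptance criteria: heading or a markdown checkbox list
  if grep -Eq '^## (Acceptance Criteria|Done When)|^[[:space:]]*- \[[ xX]\]' <<<"$body"; then
    score=$((score + 1))
  fi
  # 7. Dependencies section (assumed to count whenever present)
  if grep -q '^## Dependencies' <<<"$body"; then score=$((score + 1)); fi

  echo "$score"
}

# Map a 0-7 score to the quality tiers used by this step.
classify_score() {
  local score="$1"
  if (( score >= 5 )); then
    echo "well-defined"
  elif (( score >= 3 )); then
    echo "needs-refinement"
  else
    echo "underspecified"
  fi
}

# Made-up sample issue, for illustration only
sample_body=$'## File Scope\npackages/kernel/src/\n\n## Acceptance Criteria\n- [ ] Kernel rejects unknown action types'
sample_score=$(score_issue "$sample_body" "task:bug,priority:high")
echo "score=$sample_score tier=$(classify_score "$sample_score")"
```

Labels are passed as a comma-separated string; something like `gh issue view <NUMBER> --json labels --jq '.labels | map(.name) | join(",")'` should produce that form.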
- -Classify: -- **Well-defined** (score 5-7): Ready for autonomous work -- **Needs refinement** (score 3-4): Missing key metadata -- **Underspecified** (score 0-2): Not ready for autonomous implementation - -### 7. Value Drift Detection - -Compare what the sprint plan recommended vs. what actually happened: - -1. Parse the latest sprint plan issue body (from Step 2) to extract the **Recommended Sequence** -2. Compare against recently closed issues and merged PRs -3. Calculate **alignment score**: `recommended_items_completed / total_items_completed` - -Classify drift: -- **On track** (alignment ≥ 70%): Work matches recommendations -- **Minor drift** (alignment 40-69%): Some unplanned work -- **Significant drift** (alignment < 40%): Work diverged from plan - -Also check: -- **Unplanned work ratio**: Issues closed that were NOT in the sprint plan -- **Stale plan items**: Sprint plan items with no progress in 7+ days - -### 8. Feature Completeness Analysis - -For the current ROADMAP phase, group related open issues into feature clusters: - -1. Group by shared file scope paths (issues touching the same `src/` directories) -2. Group by shared keywords in titles/bodies -3. Group by explicit dependency chains - -For each cluster, assess: -- **Is the cluster complete?**: All issues in the cluster are either closed or have open PRs -- **Has gaps?**: Related functionality referenced in ROADMAP but no corresponding issue exists -- **Blocking the phase?**: Unresolved cluster that blocks phase completion - -### 9. Generate Product Health Report - -Compose a structured report in markdown: - -**Header**: -- Generation timestamp (UTC) -- HEAD commit SHA -- Current ROADMAP phase and overall progress percentage - -**Roadmap Progress Dashboard** (table): -| Phase | Status | Progress | Velocity | ETA | -Showing all phases with completion bars and key metrics. 
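The Progress, Velocity, and ETA columns follow from the formulas in the Phase Progress Assessment step. A minimal sketch of one dashboard row, assuming hypothetical `PhaseCounts` and `PhaseMetrics` shapes and a `phaseMetrics` helper that are not part of the skill itself:

```typescript
// Hypothetical input: counts already extracted from the ROADMAP checkboxes
// and from the issues aligned to this phase.
interface PhaseCounts {
  checkedItems: number;     // count of `- [x]` items
  totalItems: number;       // count of `- [x]` and `- [ ]` items
  closedLast14Days: number; // aligned issues closed in the last 14 days
}

interface PhaseMetrics {
  completionPct: number;   // checked_items / total_items as a percentage
  velocityPerWeek: number; // issues closed per week (last 14 days / 2)
  etaWeeks: number | null; // remaining_items / velocity, null when velocity is 0
}

function phaseMetrics(counts: PhaseCounts): PhaseMetrics {
  // Guard against phases with no checkbox items (assumption: report 0%).
  const completionPct =
    counts.totalItems === 0
      ? 0
      : Math.round((counts.checkedItems / counts.totalItems) * 100);

  const velocityPerWeek = counts.closedLast14Days / 2;
  const remainingItems = counts.totalItems - counts.checkedItems;

  // The estimate is only defined when velocity > 0.
  const etaWeeks = velocityPerWeek > 0 ? remainingItems / velocityPerWeek : null;

  return { completionPct, velocityPerWeek, etaWeeks };
}
```

A `null` ETA with a velocity of zero on the current phase corresponds to the "stalled" flag, so the report can render it as a risk marker rather than a number.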
- -**Alignment Audit** (table): -| Category | Count | % of Total | -Showing aligned, orphaned, and cross-cutting issue/PR counts. - -**Orphaned Work** (list, max 10): -Issues/PRs with no roadmap mapping — each with a brief recommendation (create roadmap item, reclassify, or close). - -**Phase Health**: -For the current and next phases: -- Items remaining (with text) -- Open issues targeting this phase -- Velocity trend -- Risk flags (stalled, regression) - -**Issue Quality Summary** (table): -| Quality Tier | Count | % | Example Issues | -With aggregate statistics and the bottom 5 worst-scored issues. - -**Value Drift Report**: -- Alignment score with sprint plan -- Unplanned work ratio -- Top 3 recommended items that had no progress - -**Feature Completeness** (current phase): -- Clusters identified -- Gaps found -- Blocking clusters - -**Recommendations** (numbered, max 5): -The top 5 product-level actions that would improve roadmap delivery. Focus on: -1. Issues to create (feature gaps) -2. Issues to refine (underspecified) -3. Orphaned work to address -4. Phase blockers to unblock -5. Quality improvements - -### 10. 
Route Output (Report Routing Protocol) - -Apply the `report-routing` protocol to determine where output goes: - -**Assess severity**: Check if ANY of the following critical conditions exist: -- Significant value drift detected (alignment score <50%) -- Current phase progress stalled (<10% change over 7 days with active issues) -- Multiple critical feature gaps identified - -**If critical conditions exist → ALERT tier**: - -```bash -gh issue create \ - --title "ALERT: Product health concern — $(date +%Y-%m-%d)" \ - --body "<critical findings with evidence>" \ - --label "source:product-agent" --label "<%= labels.critical %>" --label "<%= labels.pending %>" -``` - -**Otherwise → REPORT tier** (write to local file): - -```bash -mkdir -p .agentguard/reports -cat > .agentguard/reports/product-agent-$(date +%Y-%m-%d).md <<'REPORT_EOF' -<product health report markdown> -REPORT_EOF -``` - -Close any previous product health report issues that are still open: - -```bash -PREV=$(gh issue list --state open --label "source:product-agent" --json number --jq '.[].number' 2>/dev/null) -for num in $PREV; do - gh issue close "$num" --comment "Superseded — reports now written to .agentguard/reports/" 2>/dev/null || true -done -``` - -### 11. Apply Quality Labels - -For issues scoring 0-2 (underspecified), add a label to flag them for human review: - -```bash -gh issue edit <N> --add-label "needs:refinement" -``` - -Cap at **5 label applications per run**. - -Do NOT label issues that already have the `needs:refinement` label. - -### 12. 
Summary - -Report: -- **Issues analyzed**: N -- **Roadmap overall progress**: N% -- **Current phase progress**: N% (Phase N — Name) -- **Orphaned work found**: N items -- **Issue quality**: N well-defined / N needs-refinement / N underspecified -- **Value drift**: On track | Minor drift | Significant drift (alignment score N%) -- **Feature gaps identified**: N -- **Labels applied**: N -- **Product health report created**: #N -- **Top recommendation**: Brief statement of the single most important product-level action - -## Rules - -- **Routine reports go to `.agentguard/reports/`, NOT GitHub issues** — follow the report-routing protocol -- Create a maximum of **1 alert issue per run** — only for critical product health concerns -- Apply a maximum of **5 quality labels per run** -- **Never close issues** — except previous product health report issues labeled `source:product-agent` (cleanup) -- **Never modify issue bodies** — only add labels -- **Never create work issues** — that is the Backlog Steward's job. Only create the report issue. 
-- **Never assign issues** — that is the Coder Agent's job via `claim-issue` -- **Never re-prioritize issues** — that is the Planning Agent's job -- If `gh` CLI is not authenticated, report the error and STOP -- If no open issues exist, report "No issues to analyze" and STOP -- Do not re-label issues that already have the `needs:refinement` label -- When closing previous reports, verify the issue is actually labeled `source:product-agent` before closing -- Roadmap alignment mapping should be conservative — only classify as "aligned" when there is clear evidence -- Feature gap identification should be conservative — only flag gaps when a ROADMAP line item clearly has no corresponding issue diff --git a/packages/swarm/templates/skills/progress-controller.md b/packages/swarm/templates/skills/progress-controller.md deleted file mode 100644 index 7410658b..00000000 --- a/packages/swarm/templates/skills/progress-controller.md +++ /dev/null @@ -1,253 +0,0 @@ -# Skill: Progress Controller - -Track roadmap phase completion, validate milestone criteria, detect phase transition readiness, and prevent endless backlog expansion. This agent ensures the swarm converges toward roadmap goals rather than creating unbounded work. Designed for daily scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. 
Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Read <%= paths.roadmap %> - -```bash -cat <%= paths.roadmap %> -``` - -Parse the roadmap to extract: -- **All phases** with their status (`STABLE`, `IN PROGRESS`, `NEXT`, `PLANNED`) -- **Per-phase items**: Each `- [x]` (completed) and `- [ ]` (incomplete) checkbox item -- **Current active phase**: The first phase that is NOT `STABLE` / `COMPLETE` -- **Next phase**: The phase after the current active one - -### 3. Map Issues to Phases - -Fetch all open issues: - -```bash -gh issue list --state open --limit 100 --json number,title,labels,body -``` - -For each issue, determine its phase alignment: -- Match by title keywords or explicit phase references in the body -- Match by label (e.g., labels containing phase numbers or theme names) -- Match by file scope references (e.g., issues touching `packages/policy/src/` → Phase 8 Policy Ecosystem) - -Build a phase-issue map: -``` -Phase 5 (Editor Integrations): issues #A, #B — 2 open -Phase 6 (Reference Monitor Hardening): issues #C, #D, #E — 3 open -Phase 6.5 (Invariant Expansion): issues #F, #G — 2 open -... -``` - -### 5. Calculate Phase Completion - -For each phase (starting from the current active phase): - -``` -completion_ratio = checked_items / total_items (from <%= paths.roadmap %> checkboxes) -open_issues = count of issues mapped to this phase -closed_issues = count of recently closed issues mapped to this phase -``` - -**Phase completion criteria** (ALL must be true): -1. `completion_ratio >= 0.90` (90% of ROADMAP checkboxes checked) -2. `open_issues <= 1` (at most 1 remaining issue) -3. No open PRs targeting this phase's items -4. No `priority:critical` issues in this phase - -### 6. Detect Phase Transition Readiness - -If the current active phase meets completion criteria: - -1. 
Log: "Phase N (<theme>) meets completion criteria" -2. Check if the next phase has prerequisites: - - Are there dependency items from the current phase that must complete first? - - Does the next phase require infrastructure not yet in place? -3. If ready for transition, create a phase transition issue: - -```bash -gh issue create \ - --title "milestone: Phase N (<theme>) ready for completion" \ - --body "## Phase Transition Assessment - -**Phase:** N — <theme> -**ROADMAP completion:** <X>/<Y> items checked (<ratio>%) -**Open issues:** <count> -**Open PRs:** <count> - -### Completion Evidence -<list of completed ROADMAP items> - -### Remaining Items -<list of unchecked items, if any — note why they don't block completion> - -### Next Phase -Phase <N+1> — <theme> -**Readiness:** Ready / Blocked by <dependency> - -### Recommendation -- [ ] Mark Phase N as STABLE in <%= paths.roadmap %> -- [ ] Begin Phase <N+1> work - ---- -*Auto-created by progress-controller on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" \ - --label "source:progress-controller" --label "<%= labels.pending %>" -``` - -### 7. Detect Backlog Expansion - -Compare issue creation rate vs. closure rate: - -```bash -# Issues created in last 7 days -gh issue list --state all --json number,createdAt --limit 100 --jq '[.[] | select(.createdAt > "'$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)'")]' 2>/dev/null - -# Issues closed in last 7 days -gh issue list --state closed --json number,closedAt --limit 100 --jq '[.[] | select(.closedAt > "'$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)'")]' 2>/dev/null -``` - -**Condition: Backlog expanding** — Issues created (7d) > 2x issues closed (7d) AND total open issues > 30. - -**Response:** -1. Log the expansion rate in the progress report -2. 
Check which agent is creating the most issues: - ```bash - gh issue list --state open --label "source:backlog-steward" --json number --jq length - ``` -3. If Backlog Steward is the primary source and expansion is >2x, recommend pausing issue creation in the progress report - -### 8. Detect Stale Phases - -For phases marked `IN PROGRESS` or `NEXT`: - -**Condition: Phase stalled** — No ROADMAP checkboxes changed AND no issues closed for this phase in 14+ days. - -**Response:** -1. Log the stall in the progress report -2. Analyze why: - - Are all issues for this phase blocked? - - Is the Coder Agent working on unrelated issues? - - Has the Planning Agent deprioritized this phase? - -### 9. Update Swarm State - -Read current swarm-state.json: - -```bash -cat <%= paths.swarmState %> 2>/dev/null || echo '{}' -``` - -Update with phase tracking data: -- `currentPhase`: the active phase name/number derived from <%= paths.roadmap %> -- `phaseCompletion`: object with `{ phase: string, checked: number, total: number, ratio: number }` -- `nextPhase`: the next phase name/number -- `backlogHealth`: `{ openIssues: number, createdLast7d: number, closedLast7d: number, expansionRate: number }` -- `lastProgressRun`: current ISO timestamp - -Preserve all other fields. - -```bash -mkdir -p .agentguard -# Write updated swarm-state.json -``` - -### 10. Generate Progress Report - -Check if a previous progress report exists: - -```bash -gh issue list --state open --label "source:progress-controller" --json number --jq '.[0].number' 2>/dev/null -``` - -If a previous report exists (and is NOT a milestone issue), close it: - -```bash -gh issue close <PREV_NUMBER> --comment "Superseded by new progress report." 
-``` - -Create the new report: - -```bash -gh issue create \ - --title "Progress Report — $(date +%Y-%m-%d)" \ - --body "<progress report markdown>" \ - --label "source:progress-controller" --label "<%= labels.pending %>" -``` - -**Report format:** - -```markdown -## Progress Controller Report - -**Timestamp:** <UTC> -**Active Phase:** Phase N — <theme> - -### Phase Completion Matrix - -| Phase | Status | Checked | Total | Completion | Open Issues | -|-------|--------|---------|-------|------------|-------------| -| Phase 5 | IN PROGRESS | X | Y | Z% | N | -| Phase 6 | NEXT | X | Y | Z% | N | -| ... | | | | | | - -### Phase Transition Readiness - -- Phase N: <Ready for completion / Not ready — X items remaining> - -### Backlog Health - -| Metric | Value | Status | -|--------|-------|--------| -| Total open issues | N | HEALTHY/WARNING | -| Created (7d) | N | | -| Closed (7d) | N | | -| Expansion rate | Nx | HEALTHY/WARNING/CRITICAL | - -### Stalled Phases - -<list any phases with no progress in 14+ days> - -### Recommendations - -1. <top recommendation> -2. <second recommendation> -``` - -### 11. 
Summary - -Report: -- **Active phase**: Phase N — <theme> (<completion>% complete) -- **Phase transition ready**: Yes/No -- **Backlog expansion rate**: Nx (healthy/warning/critical) -- **Stalled phases**: N -- **Progress report created**: #N -- **Milestone issues created**: N -- **Top recommendation**: Brief statement - -## Rules - -- Create a maximum of **1 progress report per run** -- Create a maximum of **1 milestone/transition issue per run** -- **NEVER modify <%= paths.roadmap %>** — only report findings and create milestone issues -- **NEVER modify CLAUDE.md** — that is the Documentation Maintainer's job -- **NEVER close issues** — only close previous progress report issues labeled `source:progress-controller` -- **NEVER create work issues** — that is the Backlog Steward's job -- If `gh` CLI is not authenticated, report the error and STOP -- Phase completion assessment should be conservative — only declare ready when clearly met -- Backlog expansion warnings should consider the ROADMAP scope — some expansion is natural during phase transitions diff --git a/packages/swarm/templates/skills/recovery-controller.md b/packages/swarm/templates/skills/recovery-controller.md deleted file mode 100644 index 642e59cf..00000000 --- a/packages/swarm/templates/skills/recovery-controller.md +++ /dev/null @@ -1,318 +0,0 @@ -# Skill: Recovery Controller - -Detect unhealthy swarm conditions and execute remediation playbooks to drive the autonomous SDLC back to a healthy state. This is the self-healing layer — the Kubernetes controller-manager equivalent for the agent swarm. Designed for periodic scheduled execution (every 4 hours). - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. 
- -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest remediation** in every ambiguous situation -- When in doubt: **observe and report** rather than take corrective action - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Design Principle: Reconciliation Loop - -``` -Observe actual state (GitHub, CI, swarm-state.json, agent outputs) - | -Compare to desired state (healthy thresholds) - | -If unhealthy: execute remediation playbook - | -Verify remediation succeeded - | -Update swarm-state.json with recovery actions taken -``` - -The Recovery Controller NEVER duplicates other agents' work. It only intervenes when an agent has failed to do its job or when system-level conditions prevent normal operation. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Read Swarm State - -```bash -cat <%= paths.swarmState %> 2>/dev/null || echo '{}' -``` - -Extract current state fields: `mode`, `openAgentPRs`, `prQueueHealthy`, `mergeConflicts`, `blockers`, `lastUpdated`, `recoveryActions`. - -### 3. Health Check: PR Queue - -```bash -gh pr list --state open --json number,title,headRefName,mergeable,createdAt,labels,isDraft --limit 30 -``` - -**Condition: PR queue stuck** — 5+ open non-draft PRs AND the oldest is >48 hours old with no activity. - -**Remediation playbook:** -1. Identify PRs with `merge:failed` label or `CONFLICTING` mergeable state -2. 
For PRs older than 7 days with conflicts: - ```bash - gh pr comment <NUMBER> --body "Recovery Controller: This PR has been conflicting for 7+ days. Closing to unblock the queue. The underlying issue remains open for a fresh attempt." - gh pr close <NUMBER> - ``` -3. Cap at **2 PR closures per run** -4. Do NOT close PRs with `do-not-merge` label (these are intentionally held) - -### 4. Health Check: CI on Main - -```bash -gh run list --branch main --limit 5 --json databaseId,conclusion,createdAt,name -``` - -**Condition: CI broken on main** — Last 3 runs on main all have conclusion `failure`. - -**Remediation playbook:** -1. Check if a CI Triage issue already exists: - ```bash - gh issue list --state open --label "source:ci-triage" --json number,title --limit 5 - ``` -2. If no triage issue exists, create one: - ```bash - gh issue create \ - --title "fix(ci): Main branch CI broken — recovery controller alert" \ - --body "## CI Recovery Alert - - The last 3 CI runs on main have failed. This blocks all PR merges and agent work. - - **Failed runs:** - <list run IDs and failure reasons> - - **Priority:** CRITICAL — this blocks the entire SDLC pipeline. - - --- - *Auto-created by recovery-controller on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" \ - --label "<%= labels.critical %>" --label "source:recovery-controller" --label "<%= labels.pending %>" - ``` -3. If triage issue exists but is >48h old, add a comment escalating urgency - -### 5. 
Health Check: Agent Liveness - -Check that key agents have produced recent output: - -```bash -# Planning Agent — should produce a sprint plan daily -gh issue list --state open --label "source:planning-agent" --limit 1 --json number,createdAt - -# Observability Agent — should produce a report daily -gh issue list --state open --label "source:observability-agent" --limit 1 --json number,createdAt - -# Backlog Steward — should produce issues regularly -gh issue list --label "source:backlog-steward" --limit 1 --json number,createdAt --state all -``` - -**Condition: Agent silent** — An agent's last output is >72 hours old (3x its expected frequency). - -**Remediation playbook:** -1. Log the silent agent in the recovery report -2. Check if the agent's scheduled task is still enabled: - - Look for recent issues/PRs from the agent -3. Create an alert issue if the agent has been silent >72h: - ```bash - gh issue create \ - --title "alert: <agent-name> silent for 72+ hours" \ - --body "## Agent Liveness Alert - - **Agent:** <agent-name> - **Last output:** <timestamp or 'none found'> - **Expected frequency:** <frequency> - **Impact:** <what stops working when this agent is down> - - --- - *Auto-created by recovery-controller on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" \ - --label "source:recovery-controller" --label "<%= labels.high %>" --label "<%= labels.pending %>" - ``` -3. Cap at **1 liveness alert per run** - -### 6. Health Check: Merge Conflict Cascade - -```bash -gh pr list --state open --json number,mergeable --jq '[.[] | select(.mergeable == "CONFLICTING")] | length' -``` - -**Condition: Conflict cascade** — 4+ PRs in CONFLICTING state simultaneously. - -**Remediation playbook:** -1. List all conflicting PRs sorted by age (oldest first) -2. Close PRs 4+ (keeping only the 3 oldest for the Merge Conflict Resolver to process) -3. Comment on closed PRs: - ```bash - gh pr comment <NUMBER> --body "Recovery Controller: Closing to break merge conflict cascade. 
4+ PRs were conflicting simultaneously, blocking the pipeline. The underlying issue remains open for a fresh implementation attempt." - gh pr close <NUMBER> - ``` -4. Cap at **2 cascade closures per run** - -### 7. Health Check: Swarm State Freshness - -Check if `swarm-state.json` has been updated recently: - -**Condition: Stale state** — `lastUpdated` is >48 hours ago. - -**Remediation playbook:** -1. If the Planning Agent or Observability Agent should have updated it, flag their silence (Step 5 handles this) -2. Reset `lastUpdated` to now and add a `blockers` entry noting the staleness -3. Do NOT modify other fields — preserve whatever other agents have written - -### 8. Health Check: Roadmap Progress Stall - -```bash -# Check if any issues have been closed in the last 7 days -gh issue list --state closed --json closedAt --jq '[.[] | select(.closedAt > "'$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)'")]' 2>/dev/null | head -5 -``` - -**Condition: No progress** — Zero issues closed AND zero PRs merged in the last 7 days. - -**Remediation playbook:** -1. Check for blockers: - - All PRs have failing CI? → CI is the bottleneck (Step 4) - - All PRs have conflicts? → Conflict cascade (Step 6) - - No open PRs at all? → Coder Agent may be stalled (Step 5) - - PR queue at max (5+)? → Merge pipeline is the bottleneck -2. Log the root cause analysis in the recovery report -3. If bottleneck is identifiable and fixable, take the appropriate remediation -4. If bottleneck is not clear, create an alert issue for human review - -### 9. 
Determine System Mode - -Based on health checks, determine the appropriate system mode: - -| Condition | Mode | -|-----------|------| -| All checks healthy | `normal` | -| 1-2 WARNING conditions | `normal` (with warnings logged) | -| CI broken on main OR conflict cascade OR 2+ silent agents | `conservative` | -| CI broken on main AND (conflict cascade OR 3+ silent agents OR no progress 14+ days) | `safe` | - -**Mode behaviors:** -- **normal**: All agents operate at full autonomy -- **conservative**: Coder Agent reduces to 1 PR max, Backlog Steward pauses new issue creation, PR Merger requires 1+ human review -- **safe**: Only Observability Agent and Recovery Controller run. All other agents should check mode and skip. - -### 10. Update Swarm State - -Read the current `swarm-state.json` and update: - -```bash -cat <%= paths.swarmState %> 2>/dev/null || echo '{}' -``` - -Update fields: -- `mode`: `normal` | `conservative` | `safe` -- `lastRecoveryRun`: current ISO timestamp -- `recoveryActions`: array of actions taken this run (e.g., `{"action": "closed-stale-pr", "target": "#123", "reason": "conflicting 7+ days"}`) -- `blockers`: array of current blockers with descriptions -- `healthChecks`: object with results of each health check - -Preserve all other fields written by other agents. - -```bash -mkdir -p .agentguard -# Write updated swarm-state.json -``` - -### 11. 
Generate Recovery Report - -If any remediation actions were taken OR any WARNING/CRITICAL conditions detected, route the output using the `report-routing` protocol: - -**If remediation actions were taken or CRITICAL conditions exist → write REPORT file AND create ALERT if critical**: - -```bash -mkdir -p .agentguard/reports -cat > .agentguard/reports/recovery-controller-$(date +%Y-%m-%d).md <<'REPORT_EOF' -<recovery report markdown> -REPORT_EOF -``` - -Only create an ALERT issue if CRITICAL conditions required human attention (e.g., system entered `safe` mode, or remediation failed): - -```bash -gh issue create \ - --title "ALERT: Recovery action required — $(date +%Y-%m-%d)" \ - --body "<critical recovery details>" \ - --label "source:recovery-controller" --label "<%= labels.critical %>" --label "<%= labels.pending %>" -``` - -Close any previous routine recovery report issues: - -```bash -PREV=$(gh issue list --state open --label "source:recovery-controller" --json number,labels --jq '[.[] | select(.labels | map(.name) | index("<%= labels.critical %>") | not)] | .[].number' 2>/dev/null) -for num in $PREV; do - gh issue close "$num" --comment "Superseded — recovery reports now written to .agentguard/reports/" 2>/dev/null || true -done -``` - -**If all health checks passed and no remediation needed → LOG tier**: - -```bash -mkdir -p .agentguard/logs -echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) [recovery-controller] All healthy. Mode: $(mode). No remediation needed." 
>> .agentguard/logs/swarm.log -``` - -**Report format:** -```markdown -## Recovery Controller Report - -**Timestamp:** <UTC> -**System Mode:** normal | conservative | safe - -### Health Check Results - -| Check | Status | Details | -|-------|--------|---------| -| PR Queue | HEALTHY/WARNING/CRITICAL | N open PRs, oldest Nh | -| CI on Main | HEALTHY/WARNING/CRITICAL | Last N runs: pass/fail | -| Agent Liveness | HEALTHY/WARNING/CRITICAL | N agents responsive | -| Merge Conflicts | HEALTHY/WARNING/CRITICAL | N conflicting PRs | -| Swarm State | HEALTHY/WARNING/CRITICAL | Last updated Nh ago | -| Roadmap Progress | HEALTHY/WARNING/CRITICAL | N issues closed (7d) | - -### Remediation Actions Taken - -| Action | Target | Reason | -|--------|--------|--------| -| <action> | <PR/issue> | <reason> | - -### Mode Determination - -Current mode: <mode> -Reason: <why this mode was selected> -``` - -### 12. Summary - -Report: -- **Health checks run**: 6 -- **Conditions detected**: N (N critical, N warning) -- **Remediation actions taken**: N (list actions) -- **System mode**: normal | conservative | safe (changed from <previous>?) 
-- **PRs closed**: N -- **Alert issues created**: N -- **Top concern**: Brief statement of the most critical finding - -## Rules - -- **Routine recovery reports go to `.agentguard/reports/`, NOT GitHub issues** — follow the report-routing protocol -- Create a maximum of **1 alert issue per run** — only for CRITICAL conditions requiring human attention -- Close a maximum of **2 stale/stuck PRs per run** -- Close a maximum of **2 cascade PRs per run** -- **NEVER close PRs with `do-not-merge` label** -- **NEVER force push or modify branches** — only close PRs or create issues -- **NEVER modify source code** — only manage GitHub issues and PRs -- **NEVER override other agents' decisions** — only intervene when agents have failed -- **NEVER escalate to `safe` mode without at least 2 CRITICAL conditions** -- If `gh` CLI is not authenticated, report the error and STOP -- The Recovery Controller is the ONLY agent that can set the system `mode` field in swarm-state.json -- When closing PRs, always verify the underlying issue remains open for retry -- Do not create duplicate alert issues — check for existing ones first -- Remediation actions should be minimal and targeted — do the least necessary to unblock the pipeline diff --git a/packages/swarm/templates/skills/release-prepare.md b/packages/swarm/templates/skills/release-prepare.md deleted file mode 100644 index c0298bb0..00000000 --- a/packages/swarm/templates/skills/release-prepare.md +++ /dev/null @@ -1,172 +0,0 @@ -# Skill: Release Prepare - -Prepare a new release: validate the codebase, assess governance readiness, generate a changelog from merged PRs, bump the version, and create a release-candidate issue for human approval. Designed for manual invocation when the maintainer decides to release. - -## Prerequisites - -Run `start-governance-runtime` first. All release operations must be governed. - -## Steps - -### 1. 
Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active. If governance cannot be activated, STOP. - -### 2. Determine Current Version - -```bash -node -e "console.log(require('./package.json').version)" -``` - -### 3. Determine Version Bump - -Check merged PRs since the last release to determine the appropriate semver bump: - -```bash -gh release list --limit 1 --json tagName --jq '.[0].tagName' -``` - -```bash -gh pr list --state merged --base main --json title,labels,mergedAt --limit 50 --jq '[.[] | select(.mergedAt > "<last-release-date>")]' -``` - -Determine bump type from PR titles/labels: -- Any PR with `breaking` label or title containing "BREAKING" → **major** -- Any PR with `feat` prefix or `enhancement` label → **minor** -- All else (fix, chore, docs, refactor) → **patch** - -### 4. Run Full Test Suite - -Invoke the `full-test` skill. ALL 7 checks must pass. If any check fails, STOP — do not proceed with a broken release. - -### 5. Assess Governance Readiness - -Run the analytics engine to assess governance health for the release period: - -```bash -<%= paths.cli %> analytics --format json 2>/dev/null | head -100 -``` - -Extract: -- **Risk score** (0-100) and **risk level** (low / medium / high / critical) -- **Total violations** across sessions since the last release -- **Unresolved invariant violations** (any that recurred without resolution) -- **Denial trends** (increasing, stable, or decreasing) - -Also check decision records for the release period: - -```bash -cat .agentguard/decisions/*.jsonl 2>/dev/null | wc -l -cat .agentguard/decisions/*.jsonl 2>/dev/null | grep -c '"outcome":"deny"' || echo 0 -``` - -If risk level is **critical**, warn in the release candidate issue but do NOT block the release — that is the human's decision. - -If analytics is not available, note "Governance analytics: not available" and proceed with basic telemetry counts. - -### 6. 
Generate Changelog - -From the merged PRs since last release, generate a changelog: - -``` -## What's Changed - -### Features -- <PR title> (#<number>) by @<author> - -### Bug Fixes -- <PR title> (#<number>) by @<author> - -### Maintenance -- <PR title> (#<number>) by @<author> - -**Full Changelog**: <compare-url> -``` - -Group PRs by prefix: `feat` → Features, `fix` → Bug Fixes, everything else → Maintenance. - -### 7. Bump Version - -```bash -pnpm version <patch|minor|major> --no-git-tag-version -``` - -Stage the version change: - -```bash -git add package.json package-lock.json -git commit -m "chore: bump version to <new-version>" -``` - -### 8. Capture Governance Decision - -Record the governance decision for the version bump: - -```bash -<%= paths.cli %> inspect --last --decisions 2>/dev/null -``` - -### 9. Create Release Branch - -```bash -git checkout -b release/v<new-version> -git push -u origin release/v<new-version> -``` - -### 10. Create Release Candidate Issue - -Ensure labels exist: - -```bash -gh label create "source:release-agent" --color "0E8A16" --description "Auto-created by Release Agent" 2>/dev/null || true -gh label create "release-candidate" --color "FBCA04" --description "Pending human approval for release" 2>/dev/null || true -``` - -Create the tracking issue: - -```bash -gh issue create \ - --title "Release v<new-version> — awaiting approval" \ - --body "<changelog + test results + governance readiness + release checklist>" \ - --label "source:release-agent" --label "release-candidate" -``` - -The issue body should include: -- Version: `<old-version>` → `<new-version>` (<bump-type>) -- Changelog (from step 6) -- Test results summary (from step 4) -- **Governance Readiness** section: - - Risk score: <N>/100 (<risk level>) - - Total governance decisions: <N> - - Denials: <N> - - Invariant violations: <N> - - Escalation events: <N> - - Denial trend: <increasing/stable/decreasing> -- Release checklist: - - [ ] Changelog reviewed - - [ ] Version 
number correct - - [ ] All tests passing - - [ ] Governance risk acceptable - - [ ] Ready to publish - -### 11. Summary - -Report: -- **Version**: `<old>` → `<new>` (<bump-type>) -- **PRs included**: N -- **Tests**: all passing / N failures -- **Governance risk**: <risk level> (score: <N>/100) -- **Branch**: `release/v<new-version>` pushed -- **Issue**: #<N> created — awaiting human approval -- **Next step**: Human approves → run `release-publish` skill - -## Rules - -- **Never create a GitHub Release directly** — that triggers npm publish. Only create the tracking issue. -- **Never publish to npm** — that is the `release-publish` skill's job after human approval. -- **Never force-push** the release branch. -- If the full test suite fails, STOP and report — do not prepare a broken release. -- If there are no merged PRs since the last release, report "Nothing to release" and STOP. -- If `gh` CLI is not authenticated, STOP — release preparation requires GitHub access. -- If governance risk is **critical**, include a prominent warning in the release candidate issue but do not block — the human decides. -- The release candidate issue MUST be reviewed and approved by a human before proceeding to `release-publish`. diff --git a/packages/swarm/templates/skills/release-publish.md b/packages/swarm/templates/skills/release-publish.md deleted file mode 100644 index 1eb5fee0..00000000 --- a/packages/swarm/templates/skills/release-publish.md +++ /dev/null @@ -1,115 +0,0 @@ -# Skill: Release Publish - -Publish a release after human approval. Creates a GitHub Release (which triggers the `publish.yml` workflow for npm publication), posts release notes, and closes the tracking issue. Only run after `release-prepare` and human approval. - -## Prerequisites - -- Run `start-governance-runtime` first. All release operations must be governed. -- A release candidate issue with label `release-candidate` must exist and be approved by a human. 
-- The release branch `release/v<version>` must exist. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active. If governance cannot be activated, STOP. - -### 2. Verify Approval - -Find the open release candidate issue: - -```bash -gh issue list --state open --label "release-candidate" --json number,title --limit 1 -``` - -If no open release candidate issue exists, STOP — nothing to publish. - -Check that the issue has been approved (human has commented with approval or checked all checklist items): - -```bash -gh issue view <ISSUE_NUMBER> --json body,comments -``` - -Look for: -- All checklist items checked (`[x]`) -- Or a comment containing "approved", "lgtm", or "ship it" - -If not approved, report "Release candidate #<N> has not been approved yet" and STOP. - -### 3. Verify Release Branch - -```bash -git fetch origin -git checkout release/v<version> -git log --oneline -1 -``` - -Verify the branch exists and the latest commit is the version bump. - -### 4. Final Test Verification - -Run the full test suite one more time on the release branch: - -```bash -pnpm build && pnpm test -``` - -If any test fails, STOP — do not publish a broken release. - -### 5. Merge Release Branch - -Create a PR from the release branch to main and merge it: - -```bash -gh pr create --base main --head release/v<version> --title "Release v<version>" --body "Merges release v<version>. See #<issue-number> for changelog." -``` - -Wait for CI to pass, then merge: - -```bash -gh pr merge --squash --auto -``` - -### 6. Create GitHub Release - -Extract the changelog from the release candidate issue body, then create the release: - -```bash -gh release create v<version> --target main --title "v<version>" --notes "<changelog>" -``` - -This triggers the `publish.yml` GitHub Actions workflow which handles npm publication with provenance. - -### 7. 
Close Tracking Issue - -```bash -gh issue close <ISSUE_NUMBER> --comment "Released as v<version>. npm publish triggered via GitHub Actions." -``` - -### 8. Clean Up Release Branch - -```bash -git checkout main -git pull origin main -git branch -d release/v<version> -git push origin --delete release/v<version> -``` - -### 9. Summary - -Report: -- **Version published**: v<version> -- **GitHub Release**: <release-url> -- **npm publish**: triggered via `publish.yml` workflow -- **Issue**: #<N> closed -- **Branch**: `release/v<version>` cleaned up - -## Rules - -- **Never publish without human approval** — the release candidate issue must be approved first. -- **Never run `npm publish` directly** — always use `gh release create` which triggers the CI workflow with provenance. -- **Never force-push** to main or the release branch. -- If the release candidate issue is not approved, STOP immediately. -- If tests fail on the release branch, STOP — close the release candidate issue with a failure comment. -- If `gh` CLI is not authenticated, STOP — publishing requires GitHub access. -- If the `publish.yml` workflow fails after release creation, report the failure but do not retry — the maintainer should investigate. diff --git a/packages/swarm/templates/skills/repo-hygiene.md b/packages/swarm/templates/skills/repo-hygiene.md deleted file mode 100644 index a572d2b6..00000000 --- a/packages/swarm/templates/skills/repo-hygiene.md +++ /dev/null @@ -1,172 +0,0 @@ -# Skill: Repo Hygiene - -Run nightly repository hygiene: detect stale issues, identify already-solved issues, surface undiscovered work from code annotations, and suggest backlog improvements. Creates or updates a hygiene report issue. Designed for periodic scheduled execution. - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. Requires `gh` CLI authenticated. - -## Steps - -### 1. 
Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active. If governance cannot be activated, STOP. - -### 2. Detect Stale Issues - -Find open issues with no activity in the last 30 days: - -```bash -gh issue list --state open --json number,title,labels,updatedAt --limit 100 -``` - -Filter for issues where `updatedAt` is more than 30 days ago. Exclude issues with labels `pinned`, `epic`, or `release-candidate`. - -For each stale issue, check if it has been addressed by a recent merged PR: - -```bash -gh pr list --state merged --search "<issue-title-keywords>" --json number,title,mergedAt --limit 5 -``` - -Categorize stale issues: -- **Likely solved**: a merged PR matches the issue title/keywords -- **Abandoned**: no matching PR and no recent comments -- **Blocked**: has a "blocked" label or dependency comment - -### 3. Detect Solved Issues - -Check open issues against recently merged PRs to find issues that were fixed but never closed: - -```bash -gh pr list --state merged --base main --json number,title,body --limit 30 -``` - -For each merged PR, extract referenced issue numbers from: -- PR body (patterns: `fixes #N`, `closes #N`, `resolves #N`) -- PR title (pattern: `issue-N`, `#N`) - -Check if those referenced issues are still open: - -```bash -gh issue view <N> --json state --jq '.state' -``` - -### 4. Surface Code Annotations - -Scan the codebase for undiscovered work items: - -```bash -grep -rn "TODO\|FIXME\|HACK\|XXX\|WORKAROUND" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist | head -50 -``` - -Cross-reference each annotation against open issues: - -```bash -gh issue list --state open --json number,title --limit 200 -``` - -Flag annotations that have no corresponding open issue as "undiscovered work." - -Also check `<%= paths.roadmap %>` for items not yet tracked as issues: - -```bash -cat <%= paths.roadmap %> 2>/dev/null -``` - -### 5. 
Identify Missing Test Coverage - -List source files without corresponding test files: - -```bash -find packages/ apps/ -name "*.ts" -not -name "*.test.ts" -not -path "*/node_modules/*" -not -path "*/dist/*" 2>/dev/null -find packages/ apps/ -name "*.test.ts" -not -path "*/node_modules/*" 2>/dev/null -``` - -For each source file in `packages/*/src/` or `apps/*/src/`, check if a corresponding test file exists in the same package's `tests/` directory. Flag source files with no test coverage as gaps. - -### 6. Generate Hygiene Report - -Compile findings: - -``` -## Repo Hygiene Report - -**Date**: <timestamp> - -### Stale Issues (no activity >30 days) - -| # | Title | Last Updated | Status | -|---|-------|-------------|--------| -| <N> | <title> | <date> | Likely solved / Abandoned / Blocked | - -### Likely Solved (open but fixed by merged PR) - -| Issue # | Issue Title | Fixing PR | Merged | -|---------|-------------|-----------|--------| -| <N> | <title> | #<PR> | <date> | - -### Undiscovered Work (code annotations without issues) - -| File:Line | Annotation | Text | -|-----------|-----------|------| -| <file>:<line> | TODO/FIXME/HACK | <text> | - -### Missing Test Coverage - -| Source File | Expected Test File | Status | -|-------------|-------------------|--------| -| <src/path> | <tests/ts/path> | Missing | - -### Recommendations - -<Actionable suggestions: close solved issues, investigate stale issues, create issues for annotations> -``` - -### 7. 
Create or Update Hygiene Issue - -Check for an existing hygiene issue: - -```bash -gh issue list --state open --label "source:hygiene-agent" --json number,title --limit 1 -``` - -Ensure labels exist: - -```bash -gh label create "source:hygiene-agent" --color "BFD4F2" --description "Auto-created by Repo Hygiene Agent" 2>/dev/null || true -``` - -If an existing issue is open, update it: - -```bash -gh issue comment <ISSUE_NUMBER> --body "<hygiene report>" -``` - -If no existing issue is open and there are actionable findings, create one: - -```bash -gh issue create \ - --title "repo-hygiene: <N> stale issues, <N> likely solved, <N> undiscovered items" \ - --body "<full hygiene report>" \ - --label "source:hygiene-agent" --label "<%= labels.medium %>" -``` - -### 8. Summary - -Report: -- **Stale issues**: N (N likely solved, N abandoned, N blocked) -- **Solved but open**: N issues -- **Undiscovered work**: N code annotations without issues -- **Missing test coverage**: N source files -- **Issue**: created/updated/none needed -- If clean: "Repository hygiene nominal — no action needed" - -## Rules - -- **Never close, modify, or delete issues** — only report findings and create hygiene tracking issues. -- **Never modify source code** — only read and analyze. -- **Never delete branches** — that is the `stale-branch-janitor` skill's job. -- Cap annotation scanning at 50 results to avoid excessive processing. -- If no actionable findings, report "Repository hygiene nominal" and STOP — do not create an issue. -- If `gh` CLI is not authenticated, still generate the report to console but skip issue creation. -- Stale threshold is 30 days — do not flag recently updated issues. 
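The 30-day stale threshold named in the rules above can be sketched as a small shell check. This is a minimal illustration, not part of the original skill: `is_stale` and `STALE_DAYS` are hypothetical names, and the sketch assumes GNU `date` (the `-d` flag) and ISO-8601 timestamps such as the `updatedAt` values returned by `gh issue list --json updatedAt`.

```shell
# Sketch only: is_stale and STALE_DAYS are hypothetical names, not part of the skill.
# Assumes GNU date, whose -d flag parses ISO-8601 timestamps like 2024-01-31T12:00:00Z.
STALE_DAYS=30

is_stale() {
  # $1: ISO-8601 timestamp, e.g. an issue's updatedAt value
  now_s=$(date -u +%s)
  updated_s=$(date -u -d "$1" +%s) || return 2   # unparseable timestamp
  # Stale only when strictly more than STALE_DAYS days have elapsed.
  [ $(( now_s - updated_s )) -gt $(( STALE_DAYS * 86400 )) ]
}
```

A caller might loop over issue timestamps and flag only those for which `is_stale "$updated_at"` succeeds, applying the `pinned`, `epic`, and `release-candidate` label exclusions separately.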
diff --git a/packages/swarm/templates/skills/report-routing.md b/packages/swarm/templates/skills/report-routing.md deleted file mode 100644 index 0c6ce8f0..00000000 --- a/packages/swarm/templates/skills/report-routing.md +++ /dev/null @@ -1,95 +0,0 @@ -# Skill: Report Routing - -Shared routing logic for all reporting agents. Determines where output goes based on severity of findings. This skill is NOT invoked directly — it defines the routing protocol that other reporting skills reference. - -## Routing Tiers - -### Tier 1 — ALERT (GitHub Issue) - -Create a GitHub issue ONLY when findings require human attention or action. - -**Triggers**: -- Any test failures detected -- CI broken on main branch -- LOCKDOWN event triggered -- Risk score >50 -- Security vulnerability found (critical/high severity) -- Agent attempted unauthorized action (governance violation) -- Deadlock or livelock detected in swarm -- Anomaly with severity CRITICAL - -**Format**: Use `gh issue create` with appropriate `source:<agent>` and priority labels. Cap at 1 alert issue per run. - -### Tier 2 — REPORT (Local File) - -Write a markdown report to `.agentguard/reports/` for routine scheduled output that provides value but does not require immediate action. - -**Triggers**: -- Routine health reports (test health, observability, product health) -- Sprint plans and progress updates -- Governance audit summaries with no critical findings -- Risk assessments at NORMAL or ELEVATED level -- Recovery controller reports with no remediation needed - -**Format**: Write to `.agentguard/reports/<agent-id>-<YYYY-MM-DD>.md`. Create the directory if it doesn't exist. Overwrite the same-day file if re-run (idempotent). - -```bash -mkdir -p .agentguard/reports -cat > .agentguard/reports/<agent-id>-$(date +%Y-%m-%d).md <<'REPORT_EOF' -# <Report Title> — <date> -<report content> -REPORT_EOF -``` - -### Tier 3 — LOG (Append to Log) - -Append a single summary line for runs that found nothing actionable. 
- -**Triggers**: -- "No anomalies detected" -- "Backlog clean — no new items" -- "All agents healthy" -- "No test failures" -- Run completed with no findings above INFO level - -**Format**: Append one line to `.agentguard/logs/swarm.log`: - -```bash -mkdir -p .agentguard/logs -echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) [<agent-id>] <one-line summary>" >> .agentguard/logs/swarm.log -``` - -## Routing Decision Process - -Every reporting skill should follow this decision process BEFORE publishing: - -``` -1. Assess severity of ALL findings -2. If ANY finding is CRITICAL → ALERT tier (create issue) -3. Else if findings contain actionable content → REPORT tier (write file) -4. Else → LOG tier (append line) -``` - -**Important**: A single run may produce BOTH an ALERT issue (for critical findings) AND a REPORT file (for the full report). The alert is the signal; the report is the record. - -## Superseding Previous Reports - -When using ALERT tier, check for and close previous report issues from the same agent: - -```bash -PREV=$(gh issue list --state open --label "source:<agent-id>" --json number --jq '.[0].number' 2>/dev/null) -if [ -n "$PREV" ] && [ "$PREV" != "null" ]; then - gh issue close "$PREV" --comment "Superseded by new report." 2>/dev/null || true -fi -``` - -Only close issues with the EXACT `source:<agent-id>` label. Never close alert issues labeled with priority:critical — those stay open until resolved. 
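The routing decision process described earlier in this skill can be sketched as a small shell function. This is an illustration under stated assumptions, not the skill's actual implementation: `route_tier` is a hypothetical helper, and its two arguments (the highest severity found and the number of actionable findings) are an assumed way of summarizing a run.

```shell
# Sketch only: route_tier is a hypothetical helper, not defined by the skill.
# Assumed arguments:
#   $1 = highest severity among all findings (e.g. CRITICAL, WARNING, INFO)
#   $2 = number of actionable findings
route_tier() {
  if [ "$1" = "CRITICAL" ]; then
    echo "ALERT"    # any critical finding: create a GitHub issue
  elif [ "$2" -gt 0 ]; then
    echo "REPORT"   # actionable content: write a file under .agentguard/reports/
  else
    echo "LOG"      # nothing actionable: append one line to the swarm log
  fi
}
```

Because a single run may produce both an alert and a report, a caller that receives `ALERT` would typically also write the full REPORT file.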
- -## Rules - -- ALERT-tier issues MUST include `source:<agent-id>` label for tracking -- REPORT-tier files overwrite same-day files (one file per agent per day) -- LOG-tier entries are single lines, never multi-line -- Never create a GitHub issue for routine, non-actionable reports -- When in doubt between ALERT and REPORT, choose REPORT (conservative) -- When in doubt between REPORT and LOG, choose REPORT (preserve data) diff --git a/packages/swarm/templates/skills/repository-maintenance.md b/packages/swarm/templates/skills/repository-maintenance.md deleted file mode 100644 index f40bfda8..00000000 --- a/packages/swarm/templates/skills/repository-maintenance.md +++ /dev/null @@ -1,354 +0,0 @@ -# Skill: Repository Maintenance - -Consolidated housekeeping skill that scans for code annotations, detects stale/solved issues, manages abandoned PRs and branches, and cross-references findings against existing issues and the ROADMAP. Uses governance analytics to prioritize findings by risk. Replaces the overlapping concerns of `backlog-steward`, `repo-hygiene`, and `stale-branch-janitor` in a single scheduled pass. Designed for periodic scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If data is unavailable or ambiguous, proceed with available data and note limitations -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation (skip > act) -- When in doubt about closing a PR, **warn instead of close** - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. Requires `gh` CLI authenticated with repo access. 
- -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Ensure Labels Exist - -```bash -gh label create "source:repo-maintenance" --color "C5DEF5" --description "Auto-created by Repository Maintenance skill" 2>/dev/null || true -gh label create "stale" --color "EDEDED" --description "No activity for 7+ days" 2>/dev/null || true -gh label create "source:backlog-steward" --color "C5DEF5" --description "Auto-created by Backlog Steward skill" 2>/dev/null || true -``` - -### 3. Collect Governance Context - -Read governance analytics to prioritize maintenance actions: - -```bash -<%= paths.cli %> analytics --format json 2>/dev/null | head -50 -``` - -Extract: -- **Current escalation level**: NORMAL / ELEVATED / HIGH / LOCKDOWN -- **Risk score**: current session risk level -- **Top violation patterns**: recurring issues that might indicate stale or blocked work - -If analytics is not available, proceed with standard maintenance. - -### 4. Fetch Repository State - -Fetch all open issues and PRs for cross-referencing: - -```bash -gh issue list --state open --limit 100 --json number,title,body,labels,updatedAt -gh pr list --state open --json number,title,headRefName,updatedAt,labels,author --limit 100 -gh pr list --state merged --base main --json number,title,body,mergedAt --limit 30 -``` - ---- - -## Phase A: Code Annotation Scan (from backlog-steward) - -### 5. 
Scan Code Annotations - -Search the codebase for TODO, FIXME, HACK, and XXX comments: - -```bash -grep -rn "TODO\|FIXME\|HACK\|XXX\|WORKAROUND" packages/ apps/ tests/ --include="*.ts" --include="*.js" --exclude-dir=node_modules --exclude-dir=dist | head -50 -``` - -For each match, extract: -- **File path** and **line number** -- **Annotation type** (TODO, FIXME, HACK, XXX) -- **Description text** (the rest of the line after the annotation keyword) - -### 6. Scan ROADMAP Unchecked Items - -Read `<%= paths.roadmap %>` and extract all unchecked items: - -```bash -grep -n "\- \[ \]" <%= paths.roadmap %> -``` - -For each match, extract the item description and its parent section (Phase name). - -### 7. Deduplicate Annotations Against Issues - -For each discovered annotation or ROADMAP item, check whether an open issue already covers it: - -- Compare the annotation description against each open issue title and body -- A match exists if the issue title or body contains the key phrase from the annotation (case-insensitive substring match) -- Also match if the file path and line reference appear in any open issue body -- Also check against ROADMAP items to avoid creating issues for tracked work -- If a match is found, skip the item — do NOT create a duplicate - -### 8. 
Create Issues for New Annotations - -For each unmatched item (up to **5 per run**), create a GitHub issue: - -```bash -gh issue create \ - --title "<type>: <description>" \ - --body "## Source - -- **Type**: <TODO|FIXME|HACK|ROADMAP> -- **Location**: \`<file>:<line>\` (or <%= paths.roadmap %> section) -- **Original text**: <annotation text> - -## Task Description - -<Expanded description of what needs to be done based on the annotation context> - ---- -*Discovered by repository-maintenance on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" \ - --label "source:backlog-steward" --label "<%= labels.pending %>" -``` - -Add a task type label based on the annotation: -- `FIXME` → also add `task:bug-fix` -- `TODO` → also add `task:implementation` -- `HACK` / `XXX` / `WORKAROUND` → also add `task:refactor` -- ROADMAP items → also add `task:implementation` - -Prioritize: FIXME and HACK annotations over TODO annotations when the cap is reached. - ---- - -## Phase B: Stale/Solved Issue Detection (from repo-hygiene) - -### 9. Detect Stale Issues - -From the open issues fetched in step 4, filter for issues where `updatedAt` is more than 30 days ago. Exclude issues with labels `pinned`, `epic`, or `release-candidate`. - -For each stale issue, check if it has been addressed by a recent merged PR: - -```bash -gh pr list --state merged --search "<issue-title-keywords>" --json number,title,mergedAt --limit 5 -``` - -Categorize stale issues: -- **Likely solved**: a merged PR matches the issue title/keywords -- **Abandoned**: no matching PR and no recent comments -- **Blocked**: has a "blocked" label or dependency comment - -### 10. 
Detect Solved-But-Open Issues - -Check open issues against recently merged PRs to find issues that were fixed but never closed: - -For each merged PR (from step 4), extract referenced issue numbers from: -- PR body (patterns: `fixes #N`, `closes #N`, `resolves #N`) -- PR title (pattern: `issue-N`, `#N`) - -Check if those referenced issues are still open: - -```bash -gh issue view <N> --json state --jq '.state' -``` - -### 11. Check File Path Validity - -For each open issue, check if referenced file paths still exist: - -- **File paths referenced in the issue that no longer exist** — check with `test -f` -- **Issues that reference `src/agentguard/`** — this directory was removed in a restructure - ---- - -## Phase C: Stale PR and Branch Management (from stale-branch-janitor) - -### 12. Identify Stale PRs - -From the open PRs fetched in step 4, filter for PRs where `updatedAt` is more than 7 days ago. - -Exclude: -- PRs targeting `main` or `master` as the head branch -- PRs with any `source:` label from other scheduled agents -- PRs with a `do-not-close` label - -### 13. Auto-Close Previously Warned PRs (max 3) - -From stale PRs, identify those already labeled `stale`: - -For each (up to 3): - -1. Check for new activity since the stale warning: - -```bash -gh pr view <PR_NUMBER> --json comments,reviews,commits --jq '{lastComment: .comments[-1].createdAt, lastReview: .reviews[-1].submittedAt}' -``` - -2. If **new activity exists**: remove the `stale` label and skip. - -```bash -gh pr edit <PR_NUMBER> --remove-label "stale" -``` - -3. If **no new activity**: close the PR with a comment: - -```bash -gh pr comment <PR_NUMBER> --body "Closing this PR due to 7+ days of inactivity after a stale warning. If this work is still needed, feel free to reopen. - -*Auto-closed by Repository Maintenance*" - -gh pr close <PR_NUMBER> -``` - -### 14. 
Warn Newly Stale PRs (max 5) - -From stale PRs not yet labeled `stale`: - -For each (up to 5): - -```bash -gh pr comment <PR_NUMBER> --body "This PR has had no activity for 7+ days. It will be automatically closed on the next maintenance run if no further activity occurs. - -To keep this PR open, push a commit, leave a comment, or add the \`do-not-close\` label. - -*Warning posted by Repository Maintenance*" - -gh pr edit <PR_NUMBER> --add-label "stale" --add-label "source:repo-maintenance" -``` - -### 15. Report Orphaned Stale Branches - -List remote branches with no associated open PR that have had no commits in 7+ days: - -```bash -git fetch --prune origin -git for-each-ref --sort=-committerdate --format='%(refname:short) %(committerdate:iso)' refs/remotes/origin/ | grep -v 'origin/main\|origin/master\|origin/HEAD' -``` - -For each branch, check if it has an associated open PR: - -```bash -gh pr list --head <BRANCH_NAME> --state open --json number --jq 'length' -``` - -Branches with no open PR and last commit older than 7 days are "orphaned stale branches." Report them but **do not delete them**. - ---- - -## Phase D: Report and Publish - -### 16. 
Generate Consolidated Report - -Compile all findings into a structured report: - -``` -## Repository Maintenance Report - -**Date**: <timestamp> -**Governance escalation**: <NORMAL/ELEVATED/HIGH/LOCKDOWN> - -### Code Annotations - -| File:Line | Type | Text | Status | -|-----------|------|------|--------| -| <file>:<line> | TODO/FIXME/HACK | <text> | New issue / Already tracked | - -- **Annotations found**: N TODO, N FIXME, N HACK -- **Already tracked**: N (matched to existing issues) -- **New issues created**: N - -### ROADMAP Items - -- **Unchecked items**: N -- **Already tracked as issues**: N -- **New issues created**: N - -### Stale Issues (no activity >30 days) - -| # | Title | Last Updated | Status | -|---|-------|-------------|--------| -| <N> | <title> | <date> | Likely solved / Abandoned / Blocked | - -### Solved-But-Open Issues - -| Issue # | Title | Fixing PR | Merged | -|---------|-------|-----------|--------| -| <N> | <title> | #<PR> | <date> | - -### Stale PRs - -- **Warned (newly stale)**: N -- **Auto-closed**: N -- **Revived (new activity)**: N - -### Orphaned Branches - -| Branch | Last Commit | Age | -|--------|-------------|-----| -| <name> | <date> | <N> days | - -### Recommendations - -<Actionable suggestions prioritized by governance risk level> -``` - -### 17. 
Create or Update Maintenance Issue - -Check for an existing maintenance issue: - -```bash -gh issue list --state open --label "source:repo-maintenance" --json number,title --limit 1 -``` - -If an existing issue is open, comment with the new report: - -```bash -gh issue comment <ISSUE_NUMBER> --body "<maintenance report>" -``` - -If no existing issue and there are actionable findings: - -```bash -gh issue create \ - --title "repo-maintenance: <N> findings — $(date +%Y-%m-%d)" \ - --body "<full maintenance report>" \ - --label "source:repo-maintenance" --label "<%= labels.medium %>" -``` - -If no actionable findings, report "Repository maintenance nominal" and STOP — do not create an issue. - -### 18. Summary - -Report: -- **Annotations found**: N (N new issues created) -- **Stale issues**: N (N likely solved, N abandoned) -- **Solved-but-open**: N issues -- **Stale PRs warned**: N -- **Stale PRs closed**: N -- **Orphaned branches**: N -- **Governance context**: escalation level, risk score -- **Issue**: created/updated/none needed -- If clean: "Repository maintenance nominal — no action needed" - -## Rules - -- **Create a maximum of 5 new backlog issues per run** -- **Warn a maximum of 5 stale PRs per run** -- **Auto-close a maximum of 3 previously warned PRs per run** -- **Never delete branches** — only close PRs. Branch cleanup is left to the developer. 
-- **Never close issues** — only report findings and create/comment on maintenance tracking issues -- **Never close PRs on `main` or `master`** -- **Never close PRs from other scheduled agents** — skip any PR with a `source:` label from another agent -- **Respect `do-not-close` label** — never warn or close a PR with this label -- **Never modify source code** — only read and analyze -- **Never create duplicate issues** — always deduplicate against open issues and ROADMAP -- Do not scan `node_modules/`, `dist/`, or `.git/` directories -- Cap annotation scanning at 50 results -- Stale issue threshold is 30 days, stale PR threshold is 7 days -- If `gh` CLI is not authenticated, still generate the report to console but skip issue/PR operations -- Check for activity before closing stale PRs — remove `stale` label if revived diff --git a/packages/swarm/templates/skills/resolve-merge-conflicts.md b/packages/swarm/templates/skills/resolve-merge-conflicts.md deleted file mode 100644 index 471f03c4..00000000 --- a/packages/swarm/templates/skills/resolve-merge-conflicts.md +++ /dev/null @@ -1,279 +0,0 @@ -# Skill: Resolve Merge Conflicts - -Detect open PR branches with merge conflicts against main, rebase them, and auto-resolve trivial conflicts. For complex conflicts, post a diagnostic comment listing the conflicting files and ask the human to intervene. Designed for periodic scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. 
- -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If a conflict's classification is ambiguous, treat it as **complex** and abort the rebase for that PR -- If governance activation fails, log the failure and **STOP** — do not ask what to do -- If `gh` CLI fails, log the error and **STOP** — do not ask for credentials -- If `git rebase` enters an unexpected state, run `git rebase --abort` and skip that PR -- Default to the **safest option** in every ambiguous situation (abort rebase > attempt resolution) -- When in doubt about any decision, choose the conservative path and document why in the summary - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. Requires `gh` CLI authenticated with repo access. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 1b. Check System Mode - -```bash -cat <%= paths.swarmState %> 2>/dev/null | grep -o '"mode":"[^"]*"' 2>/dev/null -``` - -- If mode is `safe`: output "System in SAFE MODE — skipping merge conflict resolution" and **STOP immediately** -- If mode is `conservative`: proceed normally (conflict resolution is safe even in conservative mode) - -### 2. Ensure Labels Exist - -```bash -gh label create "source:conflict-resolver" --color "C5DEF5" --description "Auto-created by Conflict Resolution skill" 2>/dev/null || true -gh label create "conflict:needs-human" --color "D93F0B" --description "Merge conflict requires manual resolution" 2>/dev/null || true -``` - -### 3. Save Current Branch - -Record the current branch so we can return to it later: - -```bash -git branch --show-current -``` - -### 4. 
List PRs with Merge Conflicts - -```bash -gh pr list --state open --json number,title,headRefName,mergeable,labels,author,createdAt --limit 20 -``` - -Filter results: -- **Include**: PRs where `mergeable` is `CONFLICTING` -- **Exclude**: PRs whose head branch is `main` or `master` -- **Exclude**: PRs with a `do-not-rebase` label -- **Exclude**: PRs already labeled `conflict:needs-human` (already diagnosed — waiting on human) - -Select only **1 conflicting PR** for this run (the OLDEST by `createdAt`). Processing one at a time prevents cascade conflicts where rebasing PR A invalidates PR B. - -If no conflicting PRs are found, report "No merge conflicts to resolve" and STOP. - -### 5. Process Each Conflicting PR - -For each selected PR: - -#### 5a. Fetch Latest State - -```bash -git fetch origin main -git fetch origin <HEAD_BRANCH> -``` - -#### 5b. Check Out the PR Branch - -```bash -git checkout <HEAD_BRANCH> -git reset --hard origin/<HEAD_BRANCH> -``` - -#### 5c. Attempt Rebase onto Main - -```bash -git rebase origin/main -``` - -If the rebase completes with **no conflicts**, skip to step 6. - -If the rebase **hits conflicts**, proceed to step 5d. - -#### 5d. 
Classify Conflicts - -List conflicting files: - -```bash -git diff --name-only --diff-filter=U -``` - -For each conflicting file, read the conflict markers and classify: - -**Trivial conflicts** (auto-resolve): -- **Import ordering**: both sides added different imports → accept both, sort alphabetically -- **Whitespace / formatting only**: indentation or line ending differences → accept the PR branch version (`git checkout --theirs <file>`) -- **Non-overlapping additions**: both sides added different lines in the same region but not the same lines → accept both additions -- **package.json version bumps**: version field changed by both sides → accept the higher version number -- **Trailing comma or semicolon differences**: formatting-only → accept PR branch version - -**Complex conflicts** (cannot auto-resolve): -- **Overlapping logic changes**: same lines modified with different logic on both sides -- **Structural changes**: function signatures, type definitions, or class structures changed by both sides -- **Test assertion changes**: test expectations modified on both sides -- **Deleted vs modified**: one side deleted code the other side modified -- **Renamed vs modified**: one side renamed a file/function the other side changed - -#### 5e. Resolve or Abort - -**If ALL conflicts in ALL files are trivial:** - -For each conflicting file: -1. Open the file and resolve the conflict markers according to the trivial resolution strategy above -2. Stage the resolved file: `git add <file>` -3. Continue the rebase: `git rebase --continue` -4. If additional conflicts appear, repeat classification and resolution -5. 
If any conflict during rebase becomes complex, abort: `git rebase --abort` - -**If ANY conflict is complex:** - -Abort the entire rebase: - -```bash -git rebase --abort -``` - -Post a diagnostic comment on the PR and add the `conflict:needs-human` label: - -```bash -gh pr comment <PR_NUMBER> --body "**AgentGuard Conflict Resolution Bot** — manual resolution needed - -## Merge Conflicts Detected - -This branch has conflicts with \`main\` that require manual resolution. - -### Conflicting Files - -| File | Conflict Type | Details | -|------|--------------|---------| -| <file> | <trivial/complex> | <brief description of the conflict> | - -### Suggested Resolution - -\`\`\`bash -git fetch origin main -git checkout <HEAD_BRANCH> -git rebase origin/main -# Resolve conflicts in the files listed above -git add <resolved-files> -git rebase --continue -git push --force-with-lease origin <HEAD_BRANCH> -\`\`\` - ---- -*Automated diagnosis by resolve-merge-conflicts skill on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" - -gh pr edit <PR_NUMBER> --add-label "conflict:needs-human" --add-label "source:conflict-resolver" -``` - -Skip to the next PR. - -### 6. Verify After Successful Rebase - -Run the full quality suite to ensure the rebase didn't break anything: - -```bash -pnpm build && pnpm ts:check && pnpm lint && pnpm format && pnpm ts:test && pnpm test -``` - -If the suite fails: -1. Attempt auto-fix: `pnpm lint:fix && pnpm format:fix` -2. Re-run: `pnpm build && pnpm ts:check && pnpm lint && pnpm format && pnpm ts:test && pnpm test` -3. If still failing: the rebase introduced a regression. Reset the branch and post a diagnostic comment: - -```bash -git checkout <HEAD_BRANCH> -git reset --hard origin/<HEAD_BRANCH> -gh pr comment <PR_NUMBER> --body "**AgentGuard Conflict Resolution Bot** — rebase succeeded but quality suite failed - -The branch was successfully rebased onto main, but the full quality suite failed after rebase.
This suggests an incompatibility between the PR changes and recent main updates. - -**Failure output**: <relevant error excerpt> - -The branch has been left in its original state. Manual intervention needed to reconcile with main. - ---- -*Automated diagnosis by resolve-merge-conflicts skill on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" - -gh pr edit <PR_NUMBER> --add-label "conflict:needs-human" --add-label "source:conflict-resolver" -``` - -Skip to the next PR. - -### 7. Force Push the Rebased Branch - -Use `--force-with-lease` for safety (prevents overwriting commits pushed by someone else since our fetch): - -```bash -git push --force-with-lease origin <HEAD_BRANCH> -``` - -If force push fails (someone else pushed in the meantime), skip this PR and note it in the summary. - -### 8. Comment on the PR - -```bash -gh pr comment <PR_NUMBER> --body "**AgentGuard Conflict Resolution Bot** — conflicts resolved - -## Rebase Summary - -- **Base**: \`main\` ($(git rev-parse --short origin/main)) -- **Conflicts resolved**: <N> file(s) - -### Resolved Files - -| File | Resolution | -|------|------------| -| <file> | <how it was resolved — e.g., 'Merged non-overlapping additions', 'Accepted PR imports'> | - -### Verification - -Full quality suite passed: build, typecheck, lint, format, ts:test, test - ---- -*Automated fix by resolve-merge-conflicts skill on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" - -gh pr edit <PR_NUMBER> --add-label "source:conflict-resolver" -``` - -If the `conflict:needs-human` label was previously on the PR, remove it: - -```bash -gh pr edit <PR_NUMBER> --remove-label "conflict:needs-human" 2>/dev/null || true -``` - -### 9. Return to Original Branch - -After processing all PRs: - -```bash -git checkout <ORIGINAL_BRANCH> -``` - -### 10. 
Summary - -Report: -- **PRs with conflicts found**: N -- **Conflicts auto-resolved**: N (list PR numbers, file counts, and resolution types) -- **Conflicts requiring human intervention**: N (list PR numbers and conflicting files) -- **Rebase succeeded but suite failed**: N (list PR numbers) -- **Skipped (do-not-rebase label)**: N -- **Skipped (already diagnosed)**: N -- **Skipped (force push rejected)**: N -- If clean: "No merge conflicts found — all PRs are mergeable" - -## Rules - -- Resolve a maximum of **1 PR per run** (serialized to prevent cascade conflicts) -- **Only use `--force-with-lease`** — never use `--force` (force-with-lease prevents overwriting concurrent pushes) -- **Never rebase `main` or `master`** — only PR branches -- **If ANY conflict in a PR is complex, abort the ENTIRE rebase for that PR** — do not partially resolve. Either all conflicts are trivially resolvable or none are attempted. -- **Run full quality suite after rebase** — if it fails, reset the branch to its original state -- **Never modify protected files during conflict resolution**: `<%= paths.policy %>`, `.claude/settings.json` -- **Respect `do-not-rebase` label** — never rebase PRs with this label -- **Skip PRs already labeled `conflict:needs-human`** — these have already been diagnosed and are waiting on a human -- If `gh` CLI is not authenticated, report the error and STOP -- Always return to the original branch after processing, even if errors occur diff --git a/packages/swarm/templates/skills/respond-to-pr-reviews.md b/packages/swarm/templates/skills/respond-to-pr-reviews.md deleted file mode 100644 index 4651852c..00000000 --- a/packages/swarm/templates/skills/respond-to-pr-reviews.md +++ /dev/null @@ -1,247 +0,0 @@ -# Skill: Respond to PR Reviews - -Detect unresolved review comments on agent-authored PRs, make code changes to address the feedback, validate changes against governance policy, and reply to each thread. 
Keeps agent PRs moving toward merge without requiring human re-implementation. Designed for periodic scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If a comment's intent is ambiguous, classify it as **non-actionable** and reply acknowledging it -- If a code change is uncertain, **skip it** and reply explaining what was unclear -- If governance activation fails, log the failure and **STOP** — do not ask what to do -- If `gh` CLI fails, log the error and **STOP** — do not ask for credentials -- If a branch has unexpected state, **skip that PR** and move to the next -- Default to the **safest option** in every ambiguous situation (skip > attempt) -- When in doubt about any decision, choose the conservative path and document why in the summary - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. Requires `gh` CLI authenticated with repo access. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Ensure Labels Exist - -```bash -gh label create "source:review-responder" --color "C5DEF5" --description "Auto-created by Review Response skill" 2>/dev/null || true -``` - -### 3. List Agent-Authored Open PRs - -Find open PRs authored by the current authenticated user (the agent): - -```bash -gh pr list --state open --author "@me" --json number,title,headRefName,updatedAt --limit 20 -``` - -If no open agent-authored PRs exist, report "No agent-authored PRs to process" and STOP. - -### 4. 
Find PRs with Unresolved Review Feedback - -For each PR, check for unresolved review threads: - -```bash -gh pr view <PR_NUMBER> --json reviewThreads --jq '[.reviewThreads[] | select(.isResolved == false)] | length' -``` - -Also check for review comments not yet replied to by this bot: - -```bash -gh api repos/{owner}/{repo}/pulls/<PR_NUMBER>/comments --jq '[.[] | select(.body != null)] | length' -gh pr view <PR_NUMBER> --json comments --jq '[.comments[] | select(.body | contains("AgentGuard Review Response Bot"))] | length' -``` - -Skip PRs with zero unresolved threads and zero unaddressed comments. Select up to **3 PRs** with actionable feedback for this run. - -### 5. Classify Each Comment - -For each unresolved review thread, read the comment body and classify: - -**Actionable feedback** (make code changes): -- Requests a code change ("rename this", "extract this", "add a check for...") -- Points out a bug or logic error -- Asks for missing error handling or validation -- Requests additional test coverage -- Flags a convention violation with a specific fix -- Suggests a refactor with clear direction - -**Non-actionable** (acknowledge but skip code changes): -- General questions ("why did you do this?", "what does this do?") -- Praise or approval ("looks good", "nice") -- Discussion or debate without a clear requested change -- Requests that require product/architecture decisions outside agent scope -- Comments already replied to by `**AgentGuard Review Response Bot**` - -For non-actionable comments, reply with a brief acknowledgment but do NOT make code changes. - -### 6. Check Out the Branch - -For each PR with actionable feedback: - -```bash -git fetch origin <HEAD_BRANCH> -git checkout <HEAD_BRANCH> -git pull origin <HEAD_BRANCH> -``` - -### 7. Address Each Actionable Comment - -For each actionable review comment: - -#### 7a. Read the Referenced Code - -Read the file and lines referenced in the review comment. Understand the surrounding context. - -#### 7b. 
Apply the Requested Change - -Make the code change that addresses the feedback. Follow project conventions: -- `camelCase` for functions/variables, `UPPER_SNAKE_CASE` for constants -- `const`/`let` only (no `var`), arrow functions preferred -- `import type` for type-only imports -- Single quotes, trailing commas (es5), semicolons - -#### 7c. Validate Against Governance Policy - -After making the change, simulate each modified file against governance policy: - -```bash -<%= paths.cli %> simulate --action file.write --target <modified-file> --policy <%= paths.policy %> --json 2>/dev/null -``` - -If simulation shows a denial: -- Do NOT commit the change -- Reply to the review comment explaining the governance constraint -- Note which policy rule or invariant blocked the change - -If the simulate command is not available, skip validation and proceed. - -#### 7d. Verify the Change - -After each change, run the full quality suite: - -```bash -pnpm build && pnpm ts:check && pnpm lint && pnpm format && pnpm ts:test && pnpm test -``` - -If the suite fails after the change: -1. Attempt auto-fix: `pnpm lint:fix && pnpm format:fix` -2. Re-run the suite -3. If still failing: **revert the change** for this comment, note it as unresolvable, and move to the next comment - -### 8. Commit and Push - -Stage only the files changed to address review feedback: - -```bash -git add <changed-files> -git commit -m "fix(review): address review feedback — <brief summary of changes>" -git push origin <HEAD_BRANCH> -``` - -If multiple comments were addressed, list them in the commit body: - -``` -fix(review): address review feedback — <summary> - -- <file>: <what was changed and why> -- <file>: <what was changed and why> -``` - -### 9.
Reply to Each Review Thread - -For each addressed comment, reply on the review thread: - -```bash -gh api repos/{owner}/{repo}/pulls/<PR_NUMBER>/comments/<COMMENT_ID>/replies \ - -X POST -f body="**AgentGuard Review Response Bot** — feedback addressed - -Applied in commit <SHORT_SHA>: -- <brief description of the change> -- **Governance check**: passed (no policy violations) - ---- -*Automated response by respond-to-pr-reviews skill on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" -``` - -For comments blocked by governance policy: - -```bash -gh api repos/{owner}/{repo}/pulls/<PR_NUMBER>/comments/<COMMENT_ID>/replies \ - -X POST -f body="**AgentGuard Review Response Bot** — blocked by governance policy - -The requested change cannot be applied automatically: -- **Policy rule**: <rule that denied the change> -- **Reason**: <denial reason> -- **Affected file**: <file path> - -This requires a policy review or manual override. Run \`<%= paths.cli %> inspect --last\` for details. - ---- -*Automated response by respond-to-pr-reviews skill on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" -``` - -For comments that could NOT be addressed (suite failure, unclear request, out of scope): - -```bash -gh api repos/{owner}/{repo}/pulls/<PR_NUMBER>/comments/<COMMENT_ID>/replies \ - -X POST -f body="**AgentGuard Review Response Bot** — could not auto-resolve - -**Reason**: <explanation — e.g., 'Change causes test failures in X', 'Request requires architectural decision'> - -Manual intervention needed. 
Details: -- <specific issue or question> - ---- -*Automated response by respond-to-pr-reviews skill on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" -``` - -For non-actionable comments (questions, discussion): - -```bash -gh api repos/{owner}/{repo}/pulls/<PR_NUMBER>/comments/<COMMENT_ID>/replies \ - -X POST -f body="**AgentGuard Review Response Bot** — acknowledged - -<brief response to the question or discussion point> - ---- -*Automated response by respond-to-pr-reviews skill on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" -``` - -### 10. Return to Original Branch - -```bash -git checkout - -``` - -### 11. Summary - -Report: -- **PRs processed**: N (list PR numbers and titles) -- **Comments addressed (code changed)**: N -- **Comments blocked by governance**: N (list policy rules) -- **Comments acknowledged (non-actionable)**: N -- **Comments unresolvable**: N (list reasons) -- **Commits pushed**: N (list commit SHAs) -- **PRs skipped (no feedback)**: N -- **PRs skipped (cap reached)**: N - -## Rules - -- Process a maximum of **3 PRs per run** -- **Only respond to PRs authored by `@me`** — never modify PRs authored by humans or other agents -- **Never force push** — always regular push -- **Never modify protected files**: `<%= paths.policy %>`, `.claude/settings.json`, files in `packages/kernel/src/`, `packages/policy/src/`, `packages/invariants/src/` unless the review comment explicitly references them AND the linked issue authorizes it -- **Never push if the full quality suite fails** — revert the change and reply explaining the failure -- **Never push changes that governance policy denies** — report the denial in the review thread -- **Skip comments already replied to** by `**AgentGuard Review Response Bot**` -- **Do not approve, merge, or request changes** on PRs — only make code changes and reply to comments -- **Do not address merge/approval requests** — only code change requests -- Only commit files directly related to the review feedback — no unrelated changes -- If `gh` CLI is 
not authenticated, report the error and STOP -- If a branch has merge conflicts, skip it and note in summary — let `resolve-merge-conflicts` handle it diff --git a/packages/swarm/templates/skills/retrospective.md b/packages/swarm/templates/skills/retrospective.md deleted file mode 100644 index a3f691c9..00000000 --- a/packages/swarm/templates/skills/retrospective.md +++ /dev/null @@ -1,247 +0,0 @@ -# Skill: Retrospective - -Analyze patterns in failed PRs, CI regressions, review feedback, merge conflicts, and rollbacks to extract actionable heuristics. Publish a retrospective report with lessons learned and recommendations for improving swarm effectiveness. Designed for weekly scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. 
Analyze Failed PRs (Last 14 Days) - -```bash -gh pr list --state closed --limit 30 --json number,title,mergedAt,closedAt,headRefName,labels,body,comments -``` - -Identify PRs that were **closed without merge** (`mergedAt` is null): -- Extract the PR title and branch name -- Read PR comments for failure reasons (review rejections, CI failures, conflicts) -- Categorize the failure: - - **Review rejected**: Code quality, architecture, style issues - - **CI failed**: Test failures, lint errors, type errors, build failures - - **Conflict abandoned**: Merge conflicts that were never resolved - - **Superseded**: Replaced by another PR - - **Stale**: No activity for extended period - -Count: -- Total closed-without-merge PRs -- Failure category distribution - -### 3. Analyze Review Feedback Patterns - -```bash -gh pr list --state all --limit 30 --json number,title,reviews,reviewDecision -``` - -For PRs with reviews, analyze: -- **Common review themes**: What do reviewers flag most? (missing tests, style issues, blast radius, missing docs) -- **Review-to-merge time**: How long from first review to merge? -- **Revision count**: How many review rounds before merge? - -Look for patterns: -- Same reviewer comment appearing 3+ times across PRs → systemic issue -- PRs requiring 3+ revision rounds → agent not learning from feedback -- PRs with `CHANGES_REQUESTED` that were eventually merged → what changed? - -### 4. Analyze CI Failure Patterns - -```bash -gh run list --limit 50 --json databaseId,conclusion,headBranch,createdAt,name -``` - -For failed runs, identify patterns: -- **Failure hotspots**: Which workflow jobs fail most? (lint, typecheck, test, build) -- **Branch patterns**: Do certain branches fail more? 
-- **Flaky tests**: Same branch with both pass and fail (non-deterministic) -- **Regression patterns**: A test that was passing starts failing across multiple branches - -For the top 3 most recent failures, get details: -```bash -gh run view <RUN_ID> --json jobs --jq '.jobs[] | select(.conclusion == "failure") | {name, steps: [.steps[] | select(.conclusion == "failure") | .name]}' -``` - -### 5. Analyze Merge Conflict Patterns - -```bash -gh pr list --state all --limit 50 --json number,title,headRefName,mergeable,labels,changedFiles -``` - -Identify: -- **Conflict hotspot files**: Which files appear in conflicts most often? -- **Conflict recurrence**: Same file conflicting across 3+ PRs -- **Resolution patterns**: Were conflicts resolved by rebase or by closing the PR? - -### 6. Analyze Agent Effectiveness - -Cross-reference agent outputs: - -```bash -# Coder Agent PRs -gh pr list --state all --limit 20 --json number,title,mergedAt,closedAt,additions,deletions,headRefName - -# Issues closed vs created -gh issue list --state closed --limit 30 --json number,title,closedAt,labels -gh issue list --state open --json number,labels --jq length -``` - -Calculate: -- **PR merge rate**: merged / (merged + closed-without-merge) — target >80% -- **Average PR size**: mean of (additions + deletions) — target <300 lines -- **Issue throughput**: issues closed per week -- **Backlog growth**: open issues trend (growing / stable / shrinking) - -### 7. Detect Recurring Patterns - -From the data collected in steps 2-6, identify recurring patterns: - -**Anti-patterns** (things that keep going wrong): -- Same test failing across multiple PRs -- Same review feedback repeated (agent not adapting) -- Same files causing conflicts (need serialized access) -- PRs too large (blast radius repeatedly flagged) -- CI failures on the same step across branches - -**Success patterns** (things that work well): -- PRs that merge on first review (what makes them successful?) 
-- Phases with high completion velocity -- Agents with high effectiveness rates - -### 8. Generate Heuristics - -From the patterns detected, distill actionable heuristics: - -Format each heuristic as: -``` -HEURISTIC: <short name> -EVIDENCE: <specific data points> -RECOMMENDATION: <what should change> -AFFECTED AGENTS: <which agents should adapt> -PRIORITY: HIGH / MEDIUM / LOW -``` - -Example heuristics: -- "PR size limit" — PRs >300 lines have 60% merge rate vs 90% for smaller PRs → Coder Agent should split large changes -- "Test file co-location" — PRs without test changes get rejected 3x more → Coder Agent should always include tests -- "Conflict hotspot: src/events/schema.ts" — 5 PRs conflicted on this file → serialize work touching event schema - -### 9. Generate Retrospective Report - -Check if a previous retrospective exists: - -```bash -gh issue list --state open --label "source:retrospective-agent" --json number --jq '.[0].number' 2>/dev/null -``` - -If a previous report exists, close it: -```bash -gh issue close <PREV_NUMBER> --comment "Superseded by new retrospective." 
-``` - -Create the new report: - -```bash -gh issue create \ - --title "Retrospective — $(date +%Y-%m-%d) — Week $(date +%V)" \ - --body "<retrospective markdown>" \ - --label "source:retrospective-agent" --label "<%= labels.pending %>" -``` - -**Report format:** - -```markdown -## Weekly Retrospective - -**Period:** <start date> to <end date> -**Generated:** <timestamp UTC> - -### Velocity Metrics - -| Metric | This Week | Previous Week | Trend | -|--------|-----------|---------------|-------| -| PRs merged | N | N | up/down/stable | -| PRs closed (no merge) | N | N | | -| PR merge rate | N% | N% | | -| Average PR size | N lines | N lines | | -| Issues closed | N | N | | -| Issues created | N | N | | -| Backlog size | N | N | | - -### Failure Analysis - -#### Failed PRs (N total) -| PR | Title | Failure Category | Root Cause | -|----|-------|------------------|------------| -| #N | <title> | <category> | <brief cause> | - -#### CI Failure Hotspots -| Job/Step | Failures (14d) | Pattern | -|----------|---------------|---------| -| <job> | N | <description> | - -#### Merge Conflict Hotspots -| File | Conflicts (14d) | Impact | -|------|----------------|--------| -| <file> | N | <description> | - -### Patterns Detected - -#### Anti-Patterns -1. **<pattern name>** — <evidence> — <impact> -2. ... - -#### Success Patterns -1. **<pattern name>** — <evidence> — <why it works> -2. ... - -### Heuristics - -| # | Heuristic | Evidence | Recommendation | Affected Agents | Priority | -|---|-----------|----------|----------------|-----------------|----------| -| 1 | <name> | <data> | <action> | <agents> | HIGH/MED/LOW | - -### Top 3 Recommendations - -1. **<most impactful recommendation>** — <brief reasoning> -2. **<second recommendation>** -3. **<third recommendation>** -``` - -### 10. 
Summary - -Report: -- **Period analyzed**: 14 days -- **PRs analyzed**: N (N merged, N failed) -- **PR merge rate**: N% -- **CI failure rate**: N% -- **Anti-patterns detected**: N -- **Heuristics generated**: N -- **Top recommendation**: Brief statement -- **Retrospective created**: #N - -## Rules - -- Create a maximum of **1 retrospective report per run** -- **NEVER modify source code or tests** — only report findings -- **NEVER close issues** — only close previous retrospective reports labeled `source:retrospective-agent` -- **NEVER create work issues** — recommendations are for other agents and humans to act on -- If `gh` CLI is not authenticated, report the error and STOP -- Analysis should cover the **last 14 days** to capture enough data for pattern detection -- Heuristics should be backed by specific data points — never speculate without evidence -- Limit to **top 5 heuristics** per report (prioritize by impact) -- When calculating merge rate, exclude draft PRs and PRs with `do-not-merge` label -- The retrospective agent is read-only on the codebase — it never modifies files diff --git a/packages/swarm/templates/skills/review-open-prs.md b/packages/swarm/templates/skills/review-open-prs.md deleted file mode 100644 index 07476adf..00000000 --- a/packages/swarm/templates/skills/review-open-prs.md +++ /dev/null @@ -1,164 +0,0 @@ -# Skill: Review Open PRs - -Review open pull requests for code quality, coding convention adherence, governance compliance, and test coverage. Posts structured review comments. Designed for periodic scheduled execution. - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. 
List Open PRs - -```bash -gh pr list --state open --json number,title,author,headRefName,additions,deletions,createdAt --limit 10 -``` - -If no open PRs exist, report "No open PRs to review" and STOP. - -### 3. Filter Unreviewed PRs - -For each open PR, check if this skill has already posted a review: - -```bash -gh pr view <PR_NUMBER> --json comments --jq '.comments[].body' | grep -c "AgentGuard Review Bot" || true -``` - -Skip PRs that already have an `**AgentGuard Review Bot**` comment. Select up to **3 unreviewed PRs** for this run. - -### 4. Review Each PR - -For each selected PR: - -#### 4a. Read the Diff - -```bash -gh pr diff <PR_NUMBER> -``` - -#### 4b. Read the PR Body - -```bash -gh pr view <PR_NUMBER> --json body --jq '.body' -``` - -#### 4c. Check Changed Files - -```bash -gh pr view <PR_NUMBER> --json files --jq '.files[].path' -``` - -#### 4d. Read Linked Issue - -If the PR body or title references an issue (patterns: `fixes #N`, `closes #N`, `resolves #N`, `issue-N`, `#N`): - -```bash -gh issue view <N> --json title,body -``` - -Extract the issue's acceptance criteria (look for checklist items: `- [ ]` or `- [x]`). - -#### 4e. Evaluate Quality - -Check the diff against these criteria: - -**Semantic Review (if linked issue found):** -- Does the implementation address the issue's acceptance criteria? -- Are all acceptance criteria covered by the diff? -- Does the PR introduce scope creep (changes not related to the issue)?
- -**Coding Conventions:** -- Uses `camelCase` for functions/variables, `UPPER_SNAKE_CASE` for constants -- Uses `const`/`let` only (no `var`) -- Uses arrow functions -- Uses `import type` for type-only imports -- Single quotes, trailing commas (es5), semicolons - -**Architecture Boundaries:** -- Files in `packages/kernel/src/**`, `packages/policy/src/**`, `packages/invariants/src/**` should only be modified if the linked issue explicitly authorizes it -- `<%= paths.policy %>` and `.claude/settings.json` should not be modified -- Cross-layer imports follow dependency rules (adapters should not import from cli, kernel should not import from adapters) -- Module boundaries respected: each workspace package (kernel, events, policy, invariants, adapters, cli, core) is a distinct layer - -**Test Coverage:** -- New source files in `packages/*/src/` or `apps/*/src/` should have corresponding test files -- Bug fixes should include regression tests - -**Governance Compliance:** -- PR body should contain a `## Governance Report` section (for agent-created PRs) -- PR body should contain a `## Test Plan` section - -**Size & Complexity:** -- PRs with >500 lines changed should be flagged for potential splitting -- PRs touching >10 files should be flagged for scope assessment -- Single-purpose PRs preferred over multi-concern bundles - -**Merge Readiness:** -- CI status: check if latest commit has passing checks -- Review comments: check if any unresolved review threads exist - -```bash -gh pr checks <PR_NUMBER> --json name,state --jq '[.[] | select(.state != "SUCCESS")] | length' -gh pr view <PR_NUMBER> --json reviewDecision,reviews -``` - -**General Quality:** -- No debug logging left in (`console.log`, `debugger`) -- No commented-out code blocks -- No hardcoded secrets or credentials -- Imports are clean (no unused imports) - -### 5. 
Post Review Comment - -For each reviewed PR, post a structured comment: - -```bash -gh pr comment <PR_NUMBER> --body "**AgentGuard Review Bot** — automated code review - -## Summary - -<1-2 sentence overall assessment> - -## Findings - -| Category | Status | Details | -|----------|--------|---------| -| Semantic alignment | <PASS/WARN/FAIL/N/A> | <acceptance criteria coverage> | -| Coding conventions | <PASS/WARN/FAIL> | <brief details> | -| Architecture boundaries | <PASS/WARN/FAIL> | <brief details> | -| Test coverage | <PASS/WARN/FAIL> | <brief details> | -| Governance compliance | <PASS/WARN/FAIL> | <brief details> | -| Size & complexity | <PASS/WARN/FAIL> | <lines changed, files touched> | -| Merge readiness | <PASS/WARN/FAIL> | <CI status, unresolved comments> | -| General quality | <PASS/WARN/FAIL> | <brief details> | - -## Specific Items - -<Numbered list of specific findings with file:line references> - ---- -*Automated review by review-open-prs skill on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" -``` - -### 6. 
Summary - -Report: -- **PRs reviewed**: N (list PR numbers and titles) -- **PRs skipped (already reviewed)**: N -- **PRs skipped (cap reached)**: N -- **Overall findings**: N PASS, N WARN, N FAIL across all reviews - -## Rules - -- Review a maximum of **3 PRs per run** -- **Never approve or merge PRs** — post informational comments only -- **Never use `gh pr review --request-changes`** — only use `gh pr comment` -- **Never modify PR code** — review is read-only -- Skip PRs that already have an `**AgentGuard Review Bot**` comment -- If a PR has no diff (empty), skip it -- Be constructive — flag issues but do not use harsh language -- If `gh` CLI is not authenticated, report the error and STOP diff --git a/packages/swarm/templates/skills/risk-escalation.md b/packages/swarm/templates/skills/risk-escalation.md deleted file mode 100644 index fb39070b..00000000 --- a/packages/swarm/templates/skills/risk-escalation.md +++ /dev/null @@ -1,275 +0,0 @@ -# Skill: Risk & Escalation - -Assess cumulative swarm risk across multiple dimensions, gate dangerous operations, and escalate to human notification when autonomy should be reduced. This agent is the circuit breaker — it decides when the swarm should slow down or stop. Designed for periodic scheduled execution (every 4 hours). - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** — when in doubt, escalate rather than ignore - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. 
Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Collect Risk Signals - -Gather data from multiple sources to score risk: - -#### 2a. PR Blast Radius - -```bash -gh pr list --state open --json number,title,additions,deletions,changedFiles --limit 20 -``` - -For each open PR, compute: -- `blast_radius = additions + deletions` -- `file_count = changedFiles` - -Flag: -- Any single PR with `blast_radius > 500` lines — HIGH risk -- Any single PR with `file_count > 20` files — HIGH risk -- Total open PR blast radius > 2000 lines — ELEVATED risk - -#### 2b. Test Failure Rate - -```bash -gh run list --limit 20 --json databaseId,conclusion,createdAt,headBranch -``` - -Calculate: -- `failure_rate = failed_runs / total_runs` (last 20 runs) -- `main_failures = failed_runs on main branch` (last 5 runs on main) - -Flag: -- `failure_rate > 0.40` (40%+) — HIGH risk -- `main_failures >= 2` — CRITICAL risk (main is unreliable) -- `failure_rate > 0.20` (20%+) — ELEVATED risk - -#### 2c. Governance Denial Rate - -```bash -<%= paths.cli %> analytics --format json 2>/dev/null | head -100 -``` - -If analytics available, extract: -- `denial_rate` (last 24h) -- `invariant_violation_rate` (last 24h) -- `escalation_level` (current) - -If analytics not available, check telemetry: -```bash -cat <%= paths.logs %> 2>/dev/null | tail -200 | grep -c '"policy_result":"deny"' 2>/dev/null -cat <%= paths.logs %> 2>/dev/null | tail -200 | wc -l 2>/dev/null -``` - -Flag: -- `denial_rate > 0.25` — HIGH risk -- `invariant_violation_rate > 0.10` — HIGH risk -- `escalation_level` is HIGH or LOCKDOWN — CRITICAL risk - -#### 2d. 
Merge Conflict Rate - -```bash -gh pr list --state open --json number,mergeable --jq '[.[] | select(.mergeable == "CONFLICTING")] | length' -gh pr list --state open --json number --jq length -``` - -Calculate: -- `conflict_rate = conflicting_prs / total_open_prs` - -Flag: -- `conflict_rate > 0.50` (50%+) — HIGH risk (half the queue is broken) -- `conflict_rate > 0.25` (25%+) — ELEVATED risk - -#### 2e. Agent Churn Rate - -```bash -# PRs opened in last 24h -gh pr list --state all --json number,createdAt --limit 50 --jq '[.[] | select(.createdAt > "'$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -u -v-24H +%Y-%m-%dT%H:%M:%SZ)'")]' 2>/dev/null - -# PRs closed without merge in last 24h -gh pr list --state closed --json number,mergedAt,closedAt --limit 20 --jq '[.[] | select(.mergedAt == null)]' 2>/dev/null -``` - -Flag: -- More than 5 PRs opened in 24h — ELEVATED risk (agent hyperactivity) -- More than 3 PRs closed-without-merge in 24h — HIGH risk (wasted work) - -### 3. Compute Composite Risk Score - -Score each dimension (0-25 scale, total max 100): - -| Dimension | HEALTHY (0) | ELEVATED (10) | HIGH (20) | CRITICAL (25) | -|-----------|-------------|---------------|-----------|---------------| -| PR Blast Radius | Total <500 | Total 500-2000 | Any PR >500 lines | Multiple PRs >500 | -| Test Failures | <20% fail | 20-40% fail | 40%+ fail | Main broken | -| Governance Denials | <10% deny | 10-25% deny | 25%+ deny | LOCKDOWN | -| Conflicts | <25% conflict | 25-50% conflict | 50%+ conflict | All PRs conflict | - -`composite_risk = blast_risk + test_risk + governance_risk + conflict_risk` - -### 4. Determine Escalation Level - -| Composite Risk | Escalation | -|----------------|------------| -| 0-20 | NORMAL — full autonomy | -| 21-40 | ELEVATED — log warning, continue | -| 41-60 | HIGH — reduce autonomy, notify via issue | -| 61-100 | CRITICAL — recommend safe mode, create alert | - -### 5. 
Gate Dangerous Operations - -Check for pending operations that should be blocked at current risk level: - -#### At ELEVATED (risk > 20): -- Flag any open PR with `blast_radius > 300` as needing extra review: - ```bash - gh pr edit <NUMBER> --add-label "needs:careful-review" - ``` - Cap at **3 label additions per run**. - -#### At HIGH (risk > 40): -- Add `do-not-merge` label to any PR with `blast_radius > 500`: - ```bash - gh pr edit <NUMBER> --add-label "do-not-merge" - gh pr comment <NUMBER> --body "Risk & Escalation Agent: Adding do-not-merge — composite risk score is HIGH ($(risk_score)/100). This PR has a blast radius of $(blast_radius) lines. Manual review recommended before merging." - ``` - Cap at **2 gate actions per run**. - -#### At CRITICAL (risk > 60): -- Recommend safe mode in swarm-state.json (Recovery Controller is the authority for mode changes, so this agent only recommends) -- Create a critical alert issue - -### 6. Human Escalation - -At HIGH or CRITICAL risk levels, create a human-readable escalation issue: - -Check for existing escalation: -```bash -gh issue list --state open --label "source:risk-escalation" --label "<%= labels.critical %>" --json number --jq '.[0].number' 2>/dev/null -``` - -If no existing critical escalation: - -```bash -gh issue create \ - --title "RISK ALERT: Composite risk $(risk_score)/100 — $(escalation_level) — $(date +%Y-%m-%d)" \ - --body "## Risk Escalation Alert - -**Composite Risk Score:** $(risk_score)/100 -**Escalation Level:** $(escalation_level) -**Timestamp:** $(date -u +%Y-%m-%dT%H:%M:%SZ) - -### Risk Breakdown - -| Dimension | Score | Evidence | -|-----------|-------|----------| -| PR Blast Radius | /25 | Total: N lines, largest PR: #N (N lines) | -| Test Failures | /25 | N% failure rate, main: pass/fail | -| Governance Denials | /25 | N% denial rate, escalation: LEVEL | -| Merge Conflicts | /25 | N/N PRs conflicting (N%) | - -### Recommended Actions - -1. 
<specific action based on highest-risk dimension> -2. <second action> -3. <third action> - -### Gating Actions Taken - -| Action | Target | Reason | -|--------|--------|--------| -| <label added / merge blocked> | PR #N | <reason> | - -### What This Means - -<1-2 sentence plain-language explanation of what's going wrong and what will happen if not addressed> - ---- -*Auto-created by risk-escalation agent. This alert requires human attention.*" \ - --label "source:risk-escalation" --label "<%= labels.critical %>" --label "<%= labels.pending %>" -``` - -### 7. Update Swarm State - -Read and update swarm-state.json: - -```bash -cat <%= paths.swarmState %> 2>/dev/null || echo '{}' -``` - -Update: -- `riskScore`: composite risk score (0-100) -- `riskLevel`: `normal` | `elevated` | `high` | `critical` -- `riskBreakdown`: object with per-dimension scores -- `lastRiskAssessment`: current ISO timestamp -- `recommendedMode`: if CRITICAL, set to `safe`; if HIGH, set to `conservative`; else preserve existing - -Preserve all other fields. - -### 8. Route Risk Report (Report Routing Protocol) - -The escalation alert (Step 6) already follows ALERT tier — it only fires at HIGH/CRITICAL. 
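Which tier applies depends on the composite score and escalation level from Steps 3 and 4. A minimal sketch of that mapping follows, assuming integer percentages as inputs and treating band boundaries inclusively (the conservative reading of the tables). The function names `dimension_score` and `escalation_level` and the sample numbers are illustrative and not part of the skill.

```bash
#!/usr/bin/env bash
# Sketch only: helper names and sample inputs are assumptions for illustration.

# Score one percentage-based dimension on the 0/10/20/25 scale.
# Args: percent, ELEVATED threshold, HIGH threshold, critical flag (1 = CRITICAL condition holds)
dimension_score() {
  local pct=$1 elevated_at=$2 high_at=$3 critical=${4:-0}
  if   [ "$critical" -eq 1 ];         then echo 25
  elif [ "$pct" -ge "$high_at" ];     then echo 20
  elif [ "$pct" -ge "$elevated_at" ]; then echo 10
  else                                     echo 0
  fi
}

# Map a composite score (0-100) to the escalation level from Step 4.
escalation_level() {
  local score=$1
  if   [ "$score" -le 20 ]; then echo NORMAL
  elif [ "$score" -le 40 ]; then echo ELEVATED
  elif [ "$score" -le 60 ]; then echo HIGH
  else                           echo CRITICAL
  fi
}

# Made-up inputs: 30% test failures, 12% denials, 50% conflicting PRs,
# and a blast-radius score supplied directly (one PR over 500 lines).
blast_risk=20
test_risk=$(dimension_score 30 20 40)
governance_risk=$(dimension_score 12 10 25)
conflict_risk=$(dimension_score 50 25 50)
composite_risk=$((blast_risk + test_risk + governance_risk + conflict_risk))
echo "composite_risk=$composite_risk level=$(escalation_level "$composite_risk")"
```

With these sample inputs the sum is 60, which falls in the HIGH band, so the ALERT tier from Step 6 would apply in addition to the routine report.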
- -**For the routine risk report → REPORT tier** (write to local file): - -```bash -mkdir -p .agentguard/reports -cat > .agentguard/reports/risk-escalation-$(date +%Y-%m-%d).md <<'REPORT_EOF' -<risk report markdown with score, breakdown, and recommendations> -REPORT_EOF -``` - -Close any previous routine risk report issues that are still open (never close alert issues): - -```bash -PREV=$(gh issue list --state open --label "source:risk-escalation" --json number,labels --jq '[.[] | select(.labels | map(.name) | index("<%= labels.critical %>") | not)] | .[].number' 2>/dev/null) -for num in $PREV; do - gh issue close "$num" --comment "Superseded — routine risk reports now written to .agentguard/reports/" 2>/dev/null || true -done -``` - -**If risk is NORMAL and no gating actions taken → LOG tier**: - -```bash -mkdir -p .agentguard/logs -echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) [risk-escalation] Risk: $(risk_score)/100 (NORMAL). No actions taken." >> .agentguard/logs/swarm.log -``` - -### 9. Summary - -Report: -- **Composite risk score**: N/100 (NORMAL/ELEVATED/HIGH/CRITICAL) -- **Highest risk dimension**: <dimension> at N/25 -- **Gating actions taken**: N -- **Escalation issued**: Yes/No -- **Recommended mode**: normal/conservative/safe -- **Report created**: #N -- **Alert created**: #N or none - -## Rules - -- **Routine risk reports go to `.agentguard/reports/`, NOT GitHub issues** — follow the report-routing protocol -- Create a maximum of **1 escalation alert per run** — only at HIGH/CRITICAL risk levels -- Apply a maximum of **3 labels per run** (needs:careful-review) -- Apply a maximum of **2 gate actions per run** (do-not-merge) -- **NEVER merge or close PRs** — only label them and comment -- **NEVER modify source code** -- **NEVER set the mode field in swarm-state.json** — only set `recommendedMode` (Recovery Controller owns `mode`) -- **NEVER create duplicate escalation alerts** — check for existing ones first -- If `gh` CLI is not authenticated, report the error 
and STOP -- When computing risk, use actual data — never estimate or assume -- Risk scores should be conservative — round up when data is ambiguous -- The `do-not-merge` label is a strong signal — only apply at HIGH risk or above -- Close previous routine reports but NEVER close previous escalation alerts (those need human acknowledgment) diff --git a/packages/swarm/templates/skills/roadmap-expand.md b/packages/swarm/templates/skills/roadmap-expand.md deleted file mode 100644 index 17b22ad7..00000000 --- a/packages/swarm/templates/skills/roadmap-expand.md +++ /dev/null @@ -1,156 +0,0 @@ -# Skill: Roadmap Expand - -Parse the project `roadmap.md` file and expand active milestones into GitHub issues with labels, dependencies, and enforced issue format. Deduplicates against existing open issues. Designed for manual invocation when the maintainer updates the roadmap. - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. Requires `gh` CLI authenticated. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active. If governance cannot be activated, STOP. - -### 2. Read Roadmap - -```bash -cat roadmap.md -``` - -If `roadmap.md` does not exist, report "No roadmap.md found — nothing to expand" and STOP. - -### 3. Parse Active Phases - -Parse the roadmap using this expected format: - -```markdown -## Phase: <phase-name> -Status: active | planned | complete -Target: <optional date or milestone> - -### Feature: <feature-name> -Priority: critical | high | medium | low -Dependencies: [<other-feature-name>, ...] -Acceptance: - - <concrete acceptance criterion> -Labels: [<github-label>, ...] -Role: developer | qa | architect | documentation -Notes: <optional context> -``` - -Only process phases with `Status: active`. Skip `planned` and `complete` phases. 
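The phase filter described above could be sketched roughly as follows, assuming the roadmap follows the expected format exactly (one `Status:` line per phase, placed before its features). The function name `active_features` and the sample roadmap content are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: lists feature names that belong to phases marked "Status: active".
# active_features is an illustrative helper, not part of the skill.
active_features() {
  awk '
    /^## Phase: /    { active = 0; next }                 # new phase: inactive until its Status line
    /^Status: /      { active = ($2 == "active"); next }  # planned and complete phases stay inactive
    /^### Feature: / { if (active) { sub(/^### Feature: /, ""); print } }
  ' "$1"
}

# Sample roadmap, made up for illustration.
roadmap=$(mktemp)
cat > "$roadmap" <<'EOF'
## Phase: Alpha
Status: active

### Feature: Policy packs
Priority: high

## Phase: Beta
Status: planned

### Feature: Remote sinks
Priority: low
EOF

active_features "$roadmap"   # prints: Policy packs
rm -f "$roadmap"
```

Only the feature names are printed here; the remaining fields (priority, dependencies, acceptance criteria) would need their own extraction.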
- -For each feature in active phases, extract: -- Feature name (becomes issue title) -- Priority (becomes label) -- Dependencies (becomes "blocked by" references) -- Acceptance criteria (becomes issue body checklist) -- Labels (propagated directly) -- Role (becomes role label for agent discovery) -- Notes (becomes context section in issue body) - -### 4. Fetch Existing Issues (Comprehensive Dedup) - -Retrieve all open issues regardless of source agent: - -```bash -gh issue list --state open --json number,title,labels --limit 200 -``` - -Also retrieve recently closed issues (last 30 days) to avoid re-filing resolved work: - -```bash -gh issue list --state closed --limit 100 --json number,title,labels,closedAt -``` - -Build a lookup map of existing issue titles to avoid creating duplicates. Include issues from ALL `source:*` labels (not just `source:roadmap-agent`) — issues created by `source:backlog-steward`, `source:planning-agent`, `source:test-agent`, or any other agent count as existing coverage. - -**Matching rules** (match on ANY signal → skip): -- **Substring match**: Feature name appears as substring in issue title (case-insensitive) -- **Keyword overlap**: Extract 3-5 key terms from feature name; if ≥60% appear in an issue title, it is a match -- **Closed recency**: If a matching issue was closed in the last 30 days, treat it as covered — do NOT re-create - -### 5. 
Ensure Labels Exist - -For each unique label referenced in the roadmap: - -```bash -gh label create "<label>" --color "EDEDED" --description "Auto-created by roadmap-expand" 2>/dev/null || true -``` - -Also ensure standard labels exist: - -```bash -gh label create "source:roadmap-agent" --color "1D76DB" --description "Auto-created by Roadmap Expansion Agent" 2>/dev/null || true -gh label create "<%= labels.critical %>" --color "B60205" 2>/dev/null || true -gh label create "<%= labels.high %>" --color "D93F0B" 2>/dev/null || true -gh label create "<%= labels.medium %>" --color "FBCA04" 2>/dev/null || true -gh label create "<%= labels.low %>" --color "0E8A16" 2>/dev/null || true -gh label create "<%= labels.developer %>" --color "C5DEF5" 2>/dev/null || true -gh label create "role:qa" --color "C5DEF5" 2>/dev/null || true -gh label create "<%= labels.architect %>" --color "C5DEF5" 2>/dev/null || true -gh label create "role:documentation" --color "C5DEF5" 2>/dev/null || true -``` - -### 6. Create Issues for New Features - -For each feature that does NOT have a matching open issue (by title similarity): - -```bash -gh issue create \ - --title "<feature-name>" \ - --body "$(cat <<'EOF' -## Context - -**Phase**: <phase-name> -**Priority**: <priority> -**Role**: <role> - -## Acceptance Criteria - -- [ ] <criterion 1> -- [ ] <criterion 2> - -## Dependencies - -<list of blocking features/issues, or "None"> - -## Notes - -<notes from roadmap, or "None"> - ---- -*Auto-generated by roadmap-expand skill from roadmap.md* -EOF -)" \ - --label "source:roadmap-agent" --label "priority:<priority>" --label "role:<role>" -``` - -### 7. Link Dependencies - -For features that declare dependencies on other features, add a comment linking to the blocking issue: - -```bash -gh issue comment <ISSUE_NUMBER> --body "Blocked by #<blocking-issue-number> (<feature-name>)" -``` - -### 8. 
Summary - -Report: -- **Phase(s) processed**: <list of active phases> -- **Features found**: N total -- **Issues created**: N new -- **Issues skipped (already exist)**: N -- **Dependencies linked**: N -- If nothing to do: "Roadmap is up-to-date — no new issues needed" - -## Rules - -- **Only process `Status: active` phases** — never create issues for planned or complete phases. -- **Never create duplicate issues** — check existing open issues by title similarity before creating. -- **Never close or modify existing issues** — only create new ones and add dependency comments. -- **Never modify `roadmap.md`** — this skill is a one-way expansion (roadmap → issues). -- If a feature has no acceptance criteria, still create the issue but add a note: "Acceptance criteria not defined in roadmap — please add before implementation." -- If `gh` CLI is not authenticated, STOP — issue creation requires GitHub access. -- Cap at 20 issues per run to avoid flooding the backlog. diff --git a/packages/swarm/templates/skills/run-tests.md b/packages/swarm/templates/skills/run-tests.md deleted file mode 100644 index 4ef040e9..00000000 --- a/packages/swarm/templates/skills/run-tests.md +++ /dev/null @@ -1,89 +0,0 @@ -# Skill: Run Tests - -Run the complete build, test, and verification suite. Every step must pass before creating a pull request. - -## Prerequisites - -Run `implement-issue` first. Changes must be committed. - -## Steps - -Run these in sequence. If any step fails, fix and retry before proceeding. - -### 1. Build TypeScript - -```bash -pnpm build -``` - -Compiles all workspace packages via Turborepo. If the build fails, read the error output, fix the source files, and rebuild. Do not proceed until the build succeeds. - -### 2. Run TypeScript Tests (vitest) - -```bash -pnpm test -``` - -Report the pass/fail count. 
If any tests fail: -- If the failure is in code you modified, fix it and re-run -- If the failure is a pre-existing issue unrelated to your changes, note it but proceed -- Re-run after any fix: `pnpm test` - -### 3. Run JavaScript Tests - -```bash -pnpm test -``` - -Report the pass/fail count. Same fix-or-note approach as step 2. - -### 4. Run ESLint - -```bash -pnpm lint -``` - -If lint errors exist in files you modified: - -```bash -pnpm lint:fix -pnpm lint -``` - -If errors persist after auto-fix, fix manually. - -### 5. Run Prettier Format Check - -```bash -pnpm format -``` - -If formatting issues exist in files you modified: - -```bash -pnpm format:fix -pnpm format -``` - -### 6. Commit Fixes - -If steps 1-5 required any fixes, stage and commit them: - -```bash -git add <fixed-files> -git commit -m "fix(issue-<N>): address test/lint/format issues" -``` - -### 7. Summary - -Provide a one-line pass/fail summary: - -- **All clear**: "Build OK, Tests: X pass / 0 fail, Lint: clean, Format: clean" -- **Issues found**: "Build: pass/fail | Tests: X pass / Y fail | Lint: N errors | Format: N issues" - -## Rules - -- ALL steps must pass before proceeding to `create-pr` -- If tests fail and you cannot fix them after 2 attempts, STOP and report the failure -- Do not skip any step -- Pre-existing failures unrelated to your changes should be noted but do not block the PR diff --git a/packages/swarm/templates/skills/scheduled-docs-sync.md b/packages/swarm/templates/skills/scheduled-docs-sync.md deleted file mode 100644 index 0873981d..00000000 --- a/packages/swarm/templates/skills/scheduled-docs-sync.md +++ /dev/null @@ -1,172 +0,0 @@ -# Skill: Scheduled Documentation Sync - -Detect documentation drift between the codebase and project documentation files. If drift is found, create a branch with fixes and open a pull request. Designed for periodic scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**.
No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If data is unavailable or ambiguous, proceed with available data and note limitations -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Check for Existing Docs PR - -Before doing any work, check if an open docs-sync PR already exists: - -```bash -gh pr list --state open --label "source:docs-sync" --json number,title,url -``` - -If an open PR exists, update it instead of creating a new one. Note the PR number for step 7. - -### 3. Read Source of Truth - -Read the files that define the current state of the project: - -```bash -# Project structure -find packages/ apps/ -type f -name "*.ts" -not -path "*/node_modules/*" -not -path "*/dist/*" | sort - -# Action types -grep -c "actionType:" packages/core/src/actions.ts - -# Event kinds -grep "export type EventKind" packages/events/src/schema.ts - -# Invariant definitions -grep "id:" packages/invariants/src/definitions.ts - -# Package scripts (root turbo scripts) -cat package.json | grep -A 50 '"scripts"' - -# CLI commands -grep -r "\.command(" apps/cli/src/ --include="*.ts" | head -20 -``` - -### 4. 
Read Documentation Files - -Read the documentation files that need to stay in sync: - -- `README.md` — project overview, feature list, quick start -- `ARCHITECTURE.md` — system design, module descriptions, directory layout -- `CLAUDE.md` — AI assistant guide, project structure, commands, conventions -- `<%= paths.roadmap %>` — phase status, feature checklists - -### 5. Detect Drift - -Compare the source-of-truth data against documentation content. Check for: - -- **Stale counts**: action type count, event kind count, invariant count, test file count -- **Outdated structure trees**: directory listings that don't match actual `src/` layout -- **Missing references**: new CLI commands, new event types, or new action types not documented -- **Stale script references**: `package.json` scripts that changed but docs still reference old names -- **Phase status**: ROADMAP items marked `[ ]` that are now implemented in code - -### 6. Fix Drift - -If drift is detected, create a working branch: - -```bash -git checkout -b maintenance/docs-sync-$(date -u +%Y%m%d) -``` - -If the branch already exists (from a previous run that didn't complete): - -```bash -git checkout maintenance/docs-sync-$(date -u +%Y%m%d) -``` - -Apply fixes to the documentation files: -- Update counts to match actual source-of-truth values -- Update directory trees to match actual structure -- Add missing references for new features -- Update script references -- Mark ROADMAP items as `[x]` only if verifiably implemented - -Commit the fixes: - -```bash -git add README.md ARCHITECTURE.md CLAUDE.md <%= paths.roadmap %> -git commit -m "docs: sync documentation with codebase state - -Updated by scheduled-docs-sync on $(date -u +%Y-%m-%dT%H:%M:%SZ)" -``` - -### 7. 
Push and Create PR - -```bash -git push -u origin $(git branch --show-current) -``` - -Ensure the `source:docs-sync` label exists: - -```bash -gh label create "source:docs-sync" --color "0075CA" --description "Auto-created by Scheduled Docs Sync skill" 2>/dev/null || true -``` - -If no existing PR was found in step 2: - -```bash -gh pr create \ - --title "docs: sync documentation with codebase state" \ - --body "## Summary - -- Automated documentation sync detected drift between codebase and docs -- <list specific drift items found and fixed> - -## Changes - -- <list files modified with brief description> - -## Source - -Auto-generated by the **Scheduled Docs Sync** skill. - ---- -*Run: $(date -u +%Y-%m-%dT%H:%M:%SZ)*" \ - --label "source:docs-sync" -``` - -If an existing PR was found, update it: - -```bash -gh pr edit <PR_NUMBER> --body "<updated body with new findings>" -``` - -### 8. Return to Main Branch - -```bash -git checkout main -``` - -### 9. Summary - -Report: -- **Drift detected**: yes/no -- **Files updated**: list of documentation files modified -- **Specific fixes**: list of what was changed (counts, trees, references, etc.)
-- **PR**: URL of the created or updated PR (or "No drift — documentation in sync") - -## Rules - -- If no drift is detected, report "Documentation in sync" and STOP — do not create empty PRs -- **Never modify source code** — only documentation files (README.md, ARCHITECTURE.md, CLAUDE.md, <%= paths.roadmap %>) -- **Never mark ROADMAP items as done** unless the feature is verifiably implemented in the codebase -- **Preserve prose and commentary** — only update factual content (counts, tables, trees, lists) -- If a `source:docs-sync` PR is already open, update it instead of creating a new one -- Do NOT force push — if push fails, diagnose and report -- Always return to the `main` branch when finished diff --git a/packages/swarm/templates/skills/sdlc-pipeline-health.md b/packages/swarm/templates/skills/sdlc-pipeline-health.md deleted file mode 100644 index 44c2302f..00000000 --- a/packages/swarm/templates/skills/sdlc-pipeline-health.md +++ /dev/null @@ -1,216 +0,0 @@ -# Skill: SDLC Pipeline Health Check - -Validate the integrity of the autonomous SDLC infrastructure: skill files, governance hooks, CI workflows, GitHub labels, and build toolchain. Identifies gaps and creates issues for infrastructure problems. Designed for periodic scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If data is unavailable or ambiguous, proceed with available data and note limitations -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. 
Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Verify Skill Files - -Check that all expected skill files exist and have valid structure: - -```bash -ls .claude/skills/*.md -``` - -For each `.md` file, verify it contains a `# Skill:` heading: - -```bash -grep -l "^# Skill:" .claude/skills/*.md -``` - -Report any skill files missing the heading as malformed. - -### 3. Verify Governance Hooks - -Check that Claude Code hooks are configured: - -```bash -cat .claude/settings.json 2>/dev/null -``` - -Verify the JSON contains: -- A `PreToolUse` hook entry with a command referencing `claude-hook` or `agentguard` -- A `PostToolUse` hook entry (optional but recommended) - -If hooks are missing, flag as **CRITICAL**. - -### 4. Verify Policy File - -```bash -ls <%= paths.policy %> 2>/dev/null && echo "Policy file exists" || echo "MISSING: <%= paths.policy %>" -``` - -If the policy file exists, verify it's valid YAML: - -```bash -node -e "const fs = require('fs'); const yaml = require('yaml'); yaml.parse(fs.readFileSync('<%= paths.policy %>', 'utf8')); console.log('Valid YAML')" 2>/dev/null || echo "YAML parse failed or yaml module not available" -``` - -### 5. Verify CI Workflows - -Check that all expected CI workflow files exist: - -```bash -ls .github/workflows/size-check.yml 2>/dev/null && echo "size-check.yml: OK" || echo "MISSING: size-check.yml" -ls .github/workflows/publish.yml 2>/dev/null && echo "publish.yml: OK" || echo "MISSING: publish.yml" -ls .github/workflows/codeql.yml 2>/dev/null && echo "codeql.yml: OK" || echo "MISSING: codeql.yml" -``` - -### 6. 
Verify GitHub Labels - -Check that all required labels exist on the repository: - -```bash -gh label list --json name --jq '.[].name' | sort -``` - -Verify these labels exist: -- Status: `status:pending`, `status:in-progress`, `status:review` -- Priority: `priority:critical`, `priority:high`, `priority:medium`, `priority:low` -- Task: `task:implementation`, `task:bug-fix`, `task:refactor`, `task:test-generation`, `task:documentation` -- Role: `role:developer`, `role:architect`, `role:auditor` -- Source: `source:backlog-steward`, `source:docs-sync`, `source:governance-audit`, `source:security-audit`, `source:sdlc-health` - -Create any missing labels: - -```bash -gh label create "<label-name>" --color "<color>" --description "<description>" 2>/dev/null || true -``` - -Use these colors: -- `status:*` → `0E8A16` (green) -- `priority:*` → `D93F0B` (red) -- `task:*` → `FBCA04` (yellow) -- `role:*` → `5319E7` (purple) -- `source:*` → `C5DEF5` (light blue) - -### 7. Verify Build Toolchain - -Run the build and test suite to verify the toolchain is healthy: - -```bash -pnpm build -``` - -Report build result (pass/fail). - -```bash -pnpm test -``` - -Report test result (pass count, fail count). - -```bash -pnpm test -``` - -Report JS test result (pass count, fail count). - -```bash -pnpm lint -``` - -Report lint result (clean or error count). - -```bash -pnpm format -``` - -Report format result (clean or issue count). - -### 8. Check Telemetry Directories - -Verify telemetry output paths exist: - -```bash -ls -d .agentguard/events/ 2>/dev/null && echo "Events dir: OK" || echo "MISSING: .agentguard/events/" -ls -d logs/ 2>/dev/null && echo "Logs dir: OK" || echo "MISSING: logs/" -``` - -Create missing directories: - -```bash -mkdir -p .agentguard/events logs -``` - -### 9.
Generate Health Report - -Compile results into a structured report: - -``` -## SDLC Pipeline Health Report - -**Date**: <timestamp> - -| Component | Status | Details | -|-----------|--------|---------| -| Skill files | OK/WARN | N files, N valid | -| Governance hooks | OK/CRITICAL | PreToolUse: yes/no, PostToolUse: yes/no | -| Policy file | OK/CRITICAL | <%= paths.policy %>: exists/missing | -| CI workflows | OK/WARN | N/3 present | -| GitHub labels | OK/WARN | N/N present, N created | -| Build | OK/FAIL | pass/fail | -| TypeScript tests | OK/FAIL | N pass / N fail | -| JS tests | OK/FAIL | N pass / N fail | -| Lint | OK/WARN | clean / N errors | -| Format | OK/WARN | clean / N issues | -| Telemetry dirs | OK/CREATED | present / created | -``` - -### 10. Create Issue (if problems found) - -If any component has CRITICAL or FAIL status, check for an existing health issue: - -```bash -gh issue list --state open --label "source:sdlc-health" --json number,title --limit 1 -``` - -Ensure the label exists: - -```bash -gh label create "source:sdlc-health" --color "C5DEF5" --description "Auto-created by SDLC Pipeline Health skill" 2>/dev/null || true -``` - -If an existing issue is open, comment with the new report. If none exists, create one: - -```bash -gh issue create \ - --title "sdlc-health: <summary of critical finding>" \ - --body "<full health report>" \ - --label "source:sdlc-health" --label "<%= labels.high %>" -``` - -### 11. 
Summary - -Report: -- **Overall status**: HEALTHY / DEGRADED / BROKEN -- **Components**: pass/fail count -- **Actions taken**: labels created, directories created, issue filed -- If all components pass: "SDLC pipeline healthy — all checks passed" - -## Rules - -- **Never modify source code, policy files, or CI workflows** — only create missing labels and telemetry directories -- **Never close existing health issues** — only create new ones or comment on existing open ones -- If build or tests fail, report the failure but do NOT attempt to fix it — that's a separate workflow -- If `gh` CLI is not authenticated, skip label verification and issue creation but still run local checks -- Create missing telemetry directories silently (this is always safe) diff --git a/packages/swarm/templates/skills/security-code-scan.md b/packages/swarm/templates/skills/security-code-scan.md deleted file mode 100644 index 12b9333e..00000000 --- a/packages/swarm/templates/skills/security-code-scan.md +++ /dev/null @@ -1,180 +0,0 @@ -# Skill: Security Code Scan - -Perform static security analysis on the AgentGuard source code: scan for hardcoded secrets, unsafe patterns, path traversal risks, and input validation gaps. Complements `dependency-security-audit` which focuses on dependencies. Creates an issue if vulnerabilities are found. Designed for periodic scheduled execution. - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active. If governance cannot be activated, STOP. - -### 2. 
Scan for Hardcoded Secrets - -Search source files for patterns that indicate hardcoded credentials: - -```bash -grep -rn "password\s*=\s*['\"]" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true -grep -rn "secret\s*=\s*['\"]" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true -grep -rn "api[_-]?key\s*=\s*['\"]" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true -grep -rn "token\s*=\s*['\"]" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true -grep -rn "Bearer\s" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true -grep -rn "-----BEGIN.*PRIVATE KEY" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true -``` - -Also check configuration and example files: - -```bash -grep -rn "password\|secret\|api.key\|token" examples/ --include="*.ts" --include="*.json" --include="*.yaml" || true -``` - -Exclude false positives: references in type definitions, test fixtures with obvious dummy values, documentation strings. - -### 3. 
Scan for Unsafe Code Patterns - -Check for dangerous JavaScript/TypeScript patterns: - -```bash -# eval and Function constructor — arbitrary code execution -grep -rn "eval(" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true -grep -rn "new Function(" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true - -# Dynamic require — potential code injection -grep -rn "require(" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist | grep -v "import" || true - -# Shell command construction — command injection risk -grep -rn "exec(\|execSync(\|spawn(\|spawnSync(" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true - -# Template literals in shell commands — injection risk -grep -rn "exec\`\|execSync\`" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true -``` - -For each shell execution found, verify: -- Is the command string constructed from user/agent input? -- Is the input sanitized before interpolation? -- Could a malicious tool call inject shell metacharacters? - -### 4. Scan for Path Traversal Risks - -Check file adapter and filesystem operations for path traversal: - -```bash -grep -rn "path.join\|path.resolve\|readFile\|writeFile\|readdir\|mkdir\|unlink\|rmdir" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true -``` - -For each filesystem operation found, verify: -- Is the path validated against a whitelist or base directory? -- Could `../` sequences escape the intended directory? -- Is `path.normalize()` used before path comparison? - -Focus particularly on: -- `packages/adapters/src/file.ts` — file adapter handles file.read, file.write, file.delete -- `packages/kernel/src/aab.ts` — AAB normalizes paths from agent input -- `apps/cli/src/` — CLI commands accept user-provided paths - -### 5. 
Scan for Input Validation Gaps - -Check system boundaries where external data enters: - -```bash -# JSON parsing without try/catch -grep -rn "JSON.parse(" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true - -# stdin/process.argv handling -grep -rn "process.stdin\|process.argv" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true - -# File content used without validation -grep -rn "readFileSync\|readFile" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true -``` - -For each entry point, verify: -- Is JSON parsing wrapped in try/catch? -- Are CLI arguments validated before use? -- Are file contents validated against expected schemas? - -### 6. Check for Information Disclosure - -```bash -# Stack traces exposed to output -grep -rn "console.error(.*err\|console.log(.*stack" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist || true - -# Verbose error messages that could leak internals -grep -rn "throw new Error(" packages/ apps/ --include="*.ts" --exclude-dir=node_modules --exclude-dir=dist | head -20 -``` - -### 7. Generate Security Report - -Compile findings: - -``` -## Security Code Scan Report - -**Date**: <timestamp> -**Files scanned**: N - -### Findings Summary - -| Category | Count | Severity | -|----------|-------|----------| -| Hardcoded secrets | N | CRITICAL | -| Unsafe code patterns | N | HIGH | -| Path traversal risks | N | HIGH | -| Input validation gaps | N | MEDIUM | -| Information disclosure | N | LOW | - -### Detailed Findings - -#### <Category> - -| # | Severity | File:Line | Pattern | Risk | Recommendation | -|---|----------|-----------|---------|------|----------------| -| 1 | <level> | <file>:<line> | <code snippet> | <description> | <fix> | - -### Recommendations - -<Prioritized list of remediation actions> -``` - -### 8. 
Create or Update Issue (if findings exist) - -Check for an existing security scan issue: - -```bash -gh issue list --state open --label "source:security-scan" --json number,title --limit 1 -``` - -Ensure labels exist: - -```bash -gh label create "source:security-scan" --color "D93F0B" --description "Auto-created by Security Code Scan skill" 2>/dev/null || true -``` - -If critical or high findings exist, create or update an issue: - -```bash -gh issue create \ - --title "security-scan: <N> findings (<severity summary>)" \ - --body "<full security report>" \ - --label "source:security-scan" --label "<%= labels.high %>" -``` - -### 9. Summary - -Report: -- **Files scanned**: N -- **Findings**: N critical, N high, N medium, N low -- **Issue**: created/updated/none needed -- If clean: "No security issues found in source code" - -## Rules - -- **Never modify source code** — only read and report. -- **Never close existing security issues** — only create new ones or comment on existing. -- Exclude false positives: type definitions, test fixtures with dummy values, documentation. -- Focus on `packages/*/src/` and `apps/*/src/` directories — do not scan `node_modules/`, `dist/`, or `.git/`. -- Cap detailed findings at 20 items per category to keep reports actionable. -- If `gh` CLI is not authenticated, still generate the report to console but skip issue creation. -- Differentiate from `dependency-security-audit` — this skill scans SOURCE CODE, not dependencies. diff --git a/packages/swarm/templates/skills/sprint-planning.md b/packages/swarm/templates/skills/sprint-planning.md deleted file mode 100644 index 319628f3..00000000 --- a/packages/swarm/templates/skills/sprint-planning.md +++ /dev/null @@ -1,301 +0,0 @@ -# Skill: Sprint Planning - -Analyze the full issue backlog, open PRs, ROADMAP phases, governance risk data, and recent activity to produce a prioritized sprint plan. Apply priority labels to unlabeled issues so the Coder Agent picks the right work next. 
Designed for daily scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If data is unavailable or ambiguous, proceed with available data and note limitations -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Collect Governance Context - -Read cross-session governance data to inform prioritization: - -```bash -<%= paths.cli %> analytics --format json 2>/dev/null | head -100 -``` - -Extract: -- **Current escalation level**: NORMAL / ELEVATED / HIGH / LOCKDOWN -- **Risk score** (0-100) and **risk level** -- **Recent denial trends**: increasing, stable, or decreasing -- **Top violation patterns**: which invariants or policy rules are triggering most - -Also check current escalation state: - -```bash -cat <%= paths.logs %> 2>/dev/null | grep -i "escalat\|StateChanged" | tail -5 -``` - -If analytics is not available, note "Governance context: not available" and proceed with standard prioritization. - -### 3. 
Snapshot the Backlog - -Fetch all open issues with full metadata: - -```bash -gh issue list --state open --limit 100 --json number,title,body,labels,createdAt,updatedAt -``` - -Parse each issue to extract: -- **Issue number** and **title** -- **Labels** (status, priority, task type, role, source) -- **Dependencies** (from `## Dependencies` section or `#N` references in body) -- **File scope** (from `## File Scope` section if present) -- **Phase mapping** (infer from title, body, or ROADMAP cross-reference) - -Also check for existing sprint plan issues: - -```bash -gh issue list --state open --label "source:planning-agent" --json number,title -``` - -### 4. Snapshot In-Flight Work - -Fetch open PRs to understand what is actively being worked on: - -```bash -gh pr list --state open --json number,title,headRefName,labels,body,additions,deletions -``` - -Fetch recent CI run status: - -```bash -gh run list --limit 5 --json databaseId,conclusion,headBranch,createdAt -``` - -Note: -- PRs that reference issues (via `Closes #N` or `Implements #N`) indicate near-completion work -- Failing CI runs may indicate blocking issues - -### 5. Analyze Throughput - -Fetch recently closed issues and merged PRs to measure velocity: - -```bash -gh issue list --state closed --limit 20 --json number,title,closedAt,labels -gh pr list --state merged --limit 10 --json number,title,mergedAt,body -``` - -Calculate: -- **Issues closed in last 7 days** (throughput) -- **PRs merged in last 7 days** (delivery rate) -- **Average issue age** for open issues (staleness signal) - -### 6. 
Read ROADMAP - -Read `<%= paths.roadmap %>` to determine phase structure and current progress: - -```bash -cat <%= paths.roadmap %> -``` - -Identify: -- **Current phase**: The first phase that is not `COMPLETE` (currently Phase 3 — partially complete) -- **Remaining items in current phase**: Unchecked `- [ ]` items -- **Next phase**: The phase after current (Phase 4 — Plugin Ecosystem) -- **Phase ordering**: Issues should generally be completed in phase order - -### 7. Build Dependency Graph - -For each open issue, determine its dependencies: - -1. **Explicit dependencies**: Parse `## Dependencies` sections for `#N` references -2. **Implicit phase dependencies**: Phase 3 items before Phase 4 items before Phase 5 items -3. **PR linkage**: Issues with open PRs are in-flight, not available for new work - -For each dependency reference, check if it is resolved: - -```bash -gh issue view <DEP_NUMBER> --json state --jq '.state' -``` - -Classify each issue as: -- **Ready**: All dependencies resolved, no open PR, status is `pending` -- **Blocked**: Has unresolved dependencies -- **In-flight**: Has an open PR or is `status:in-progress` -- **Stale candidate**: Open for >30 days with no activity - -### 8. 
Prioritize Unlabeled Issues - -For issues that lack a `priority:*` label, assign priority using these signals (in order): - -| Signal | Priority | -|--------|----------| -| CI is failing and this issue relates to the failure | `priority:critical` | -| Issue has an open PR (near completion) | `priority:high` | -| Issue is in the current ROADMAP phase (Phase 3) | `priority:high` | -| Issue is documentation debt | `priority:medium` | -| Issue is an entry point to next phase (Phase 4) | `priority:medium` | -| Issue is in a future phase (Phase 5+) | `priority:low` | -| No clear signal | Do not label (leave for human review) | - -**Governance risk adjustment**: If the current escalation level is ELEVATED or higher, deprioritize issues with high estimated blast radius (16+ files in scope). If escalation is HIGH, only label issues with small file scope as `priority:high` or above. - -Apply labels: - -```bash -gh issue edit <N> --add-label "priority:<level>" -``` - -Cap at **10 label changes per run** to avoid spamming. - -### 9. Identify Stale or Obsolete Issues - -For each open issue, check for staleness indicators: - -- **File paths referenced in the issue that no longer exist** — check with `ls` or `test -f` -- **Issues that reference `src/agentguard/`** — this directory was removed in a restructure -- **Issues that may have been resolved by recently merged PRs** — cross-reference PR bodies for `Closes #N` or `Fixes #N` - -For each stale candidate, add a comment (do NOT close the issue): - -```bash -gh issue comment <N> --body "**Planning Agent**: This issue may be stale or obsolete. -- **Reason**: <specific reason> -- **Recommendation**: <close / reclassify / update> - -*Analysis by sprint-planning on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" -``` - -Cap at **3 staleness comments per run** to avoid noise. - -### 10. 
Generate Sprint Plan - -Compose a structured sprint plan in markdown with these sections: - -**Header**: -- Generation timestamp -- HEAD commit SHA -- Open issue count, open PR count -- Current ROADMAP phase - -**Governance Context**: -| Metric | Value | -|--------|-------| -| Escalation level | NORMAL / ELEVATED / HIGH / LOCKDOWN | -| Risk score | <N>/100 | -| Recent denial trend | increasing / stable / decreasing | -| Top violation | <invariant or policy rule name> | - -**Ready Now** (table): -| Priority | Issue | Title | Package/Theme | Risk Estimate | Complexity Estimate | -Sorted by priority (critical > high > medium > low), then by issue age (oldest first). - -**Blocked** (table): -| Issue | Title | Blocked By | Notes | - -**Recommended Sequence** (numbered list): -The top 5-7 issues that should be worked next, in order, with brief reasoning. Factor in governance risk — prefer lower-blast-radius issues when escalation is elevated. - -**Issues to Close or Reclassify** (list): -Issues identified as stale/obsolete with reasoning. - -**Dependency Graph** (ASCII): -Show phase-level dependencies and any cross-issue dependency chains. - -**Backlog Health Metrics**: -- Total open issues -- Issues without priority labels (before and after this run) -- Issues without status labels -- Issues older than 30 days -- Throughput: issues closed / PRs merged in last 7 days -- CI health: last 5 runs pass/fail -- Governance risk score and escalation level - -### 11. Route Output (Report Routing Protocol) - -Apply the `report-routing` protocol. Sprint plans are normally REPORT-tier (routine scheduled output). 
- -**Write the sprint plan to a local file**: - -```bash -mkdir -p .agentguard/reports -cat > .agentguard/reports/planning-agent-$(date +%Y-%m-%d).md <<'REPORT_EOF' -<sprint plan markdown> -REPORT_EOF -``` - -**If critical blockers detected** (e.g., all work blocked, no actionable issues, system in LOCKDOWN) → also create an ALERT issue: - -```bash -gh issue create \ - --title "ALERT: Sprint blocked — $(date +%Y-%m-%d)" \ - --body "<blocker details>" \ - --label "source:planning-agent" --label "<%= labels.critical %>" --label "<%= labels.pending %>" -``` - -Close any previous sprint plan issues that are still open: - -```bash -PREV=$(gh issue list --state open --label "source:planning-agent" --json number --jq '.[].number' 2>/dev/null) -for num in $PREV; do - gh issue close "$num" --comment "Superseded — sprint plans now written to .agentguard/reports/" 2>/dev/null || true -done -``` - -### 12. Update Swarm State - -After publishing the sprint plan, update `<%= paths.swarmState %>`: - -```bash -cat <%= paths.swarmState %> 2>/dev/null || echo '{}' -``` - -Update/create the file with: -- `version`: 1 -- `lastUpdated`: current ISO timestamp -- `updatedBy`: "planning-agent" -- `currentPhase`: derived from <%= paths.roadmap %> (the first phase not marked COMPLETE) -- `priorities`: array of top 5 prioritized issue objects with `issueNumber` and `priority` fields -- `documentHashes`: object with keys for <%= paths.roadmap %> — use the first 8 chars of `sha256sum` output for each - -Preserve any fields written by other agents (e.g., `openAgentPRs`, `prQueueHealthy` from Observability Agent). Only overwrite the fields listed above. - -```bash -mkdir -p .agentguard -# Write the updated swarm-state.json -``` - -### 13. 
Summary - -Report: -- **Issues analyzed**: N -- **Priority labels applied**: N (list which issues got which priority) -- **Stale issues flagged**: N -- **Sprint plan issue created**: #N -- **Previous plan closed**: #N (or "none") -- **Governance context**: escalation level, risk score, denial trend -- **Top recommendation**: Brief statement of the single most important thing to work on next - -## Rules - -- **Sprint plans go to `.agentguard/reports/`, NOT GitHub issues** — follow the report-routing protocol -- Create a maximum of **1 alert issue per run** — only when sprint is critically blocked -- Apply a maximum of **10 priority labels per run** -- Add a maximum of **3 staleness comments per run** -- **Never close issues** — only comment with recommendations and close previous sprint plan issues -- **Never modify issue bodies** — only add labels and comments -- **Never create new work issues** — that is the Backlog Steward's job -- **Never assign issues** — that is the Coder Agent's job via `claim-issue` -- If `gh` CLI is not authenticated, report the error and STOP -- If no open issues exist, report "Backlog empty — no planning needed" and STOP -- Do not re-label issues that already have a `priority:*` label — only label unlabeled issues -- When closing previous sprint plans, verify the issue is actually labeled `source:planning-agent` before closing -- When escalation is ELEVATED or higher, deprioritize high-blast-radius issues in the recommended sequence diff --git a/packages/swarm/templates/skills/stale-branch-janitor.md b/packages/swarm/templates/skills/stale-branch-janitor.md deleted file mode 100644 index 6b7bbef5..00000000 --- a/packages/swarm/templates/skills/stale-branch-janitor.md +++ /dev/null @@ -1,120 +0,0 @@ -# Skill: Stale Branch Janitor - -Scan for stale remote branches and abandoned PRs (no activity in 7+ days). Warn newly stale PRs with a comment and label, auto-close previously warned PRs that remain inactive. 
Designed for periodic scheduled execution. - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. Requires `gh` CLI authenticated with repo access. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Ensure Labels Exist - -Create the required labels if they don't already exist: - -```bash -gh label create "stale" --color "EDEDED" --description "No activity for 7+ days" 2>/dev/null || true -gh label create "source:stale-branch-janitor" --color "C5DEF5" --description "Auto-created by Stale Branch Janitor skill" 2>/dev/null || true -``` - -### 3. List Stale Open PRs - -Find open PRs with no activity in the last 7 days: - -```bash -gh pr list --state open --json number,title,updatedAt,headRefName,labels,author --limit 100 -``` - -Filter results: -- **Include**: PRs where `updatedAt` is more than 7 days ago -- **Exclude**: PRs targeting `main` or `master` as the head branch (these are the base, not head — head branches are the source) -- **Exclude**: PRs with any `source:` label from other scheduled agents (e.g., `source:coder-agent`, `source:security-audit`) — these are managed by other automation -- **Exclude**: PRs with a `do-not-close` label - -### 4. Auto-Close Previously Warned PRs (max 3) - -From the stale PRs found in step 3, identify those already labeled `stale`: - -For each (up to 3): - -1. Check if there has been **any new activity** (comments, commits, reviews) since the stale warning was posted. Use: - -```bash -gh pr view <PR_NUMBER> --json comments,reviews,commits --jq '{lastComment: .comments[-1].createdAt, lastReview: .reviews[-1].submittedAt}' -``` - -2. 
If **new activity exists** since the warning: remove the `stale` label and skip this PR: - -```bash -gh pr edit <PR_NUMBER> --remove-label "stale" -``` - -3. If **no new activity** since the warning: close the PR with a comment: - -```bash -gh pr comment <PR_NUMBER> --body "Closing this PR due to 7+ days of inactivity after a stale warning. If this work is still needed, feel free to reopen. - -*Auto-closed by Stale Branch Janitor*" - -gh pr close <PR_NUMBER> -``` - -### 5. Warn Newly Stale PRs (max 5) - -From the stale PRs found in step 3, identify those **not yet labeled** `stale`: - -For each (up to 5): - -```bash -gh pr comment <PR_NUMBER> --body "This PR has had no activity for 7+ days. It will be automatically closed on the next janitor run if no further activity occurs. - -To keep this PR open, push a commit, leave a comment, or add the \`do-not-close\` label. - -*Warning posted by Stale Branch Janitor*" - -gh pr edit <PR_NUMBER> --add-label "stale" --add-label "source:stale-branch-janitor" -``` - -### 6. Report Orphaned Stale Branches - -List remote branches with no associated open PR that have had no commits in 7+ days: - -```bash -git fetch --prune origin -git for-each-ref --sort=-committerdate --format='%(refname:short) %(committerdate:iso)' refs/remotes/origin/ | grep -v 'origin/main\|origin/master\|origin/HEAD' -``` - -For each branch, check if it has an associated open PR: - -```bash -gh pr list --head <BRANCH_NAME> --state open --json number --jq 'length' -``` - -Branches with no open PR and last commit older than 7 days are "orphaned stale branches." Report them in the summary but **do not delete them**. - -### 7. 
Summary - -Report: -- **Stale PRs found**: N total -- **PRs warned (newly stale)**: N (list PR numbers and titles) -- **PRs auto-closed**: N (list PR numbers and titles) -- **PRs revived (stale label removed)**: N (had new activity since warning) -- **Orphaned stale branches**: N (list branch names) -- **Skipped (agent-managed)**: N (PRs with `source:` labels from other agents) -- If clean: "No stale PRs or branches found — repo is tidy" - -## Rules - -- **Never delete branches** — only close PRs. Branch cleanup is left to the developer. -- **Never close PRs on `main` or `master`** — these are protected. -- **Never close PRs from other scheduled agents** — skip any PR with a `source:` label from another agent. -- **Max 5 warnings per run** — if more than 5 newly stale PRs exist, warn the first 5 and note the remainder in the summary. -- **Max 3 auto-closes per run** — if more than 3 previously warned PRs are still stale, close the first 3 and note the remainder. -- **Respect `do-not-close` label** — never warn or close a PR with this label. -- **Check for activity before closing** — if a previously warned PR has new commits, comments, or reviews since the warning, remove the `stale` label instead of closing. -- If `gh` CLI is not authenticated, report the error and STOP — do not proceed without GitHub access. -- If no stale items are found, report "Repo is tidy" and STOP. diff --git a/packages/swarm/templates/skills/start-governance-runtime.md b/packages/swarm/templates/skills/start-governance-runtime.md deleted file mode 100644 index 57a0ea56..00000000 --- a/packages/swarm/templates/skills/start-governance-runtime.md +++ /dev/null @@ -1,78 +0,0 @@ -# Skill: Start Governance Runtime - -Ensure the AgentGuard kernel is active and intercepting all tool calls before any development work begins. This skill MUST be invoked as the first step in any autonomous workflow. - -## Steps - -### 0. 
Build the CLI - -Ensure the AgentGuard CLI is compiled from the latest source before hooks reference it: - -```bash -pnpm build -``` - -If the build fails, STOP — governance hooks depend on the compiled CLI at `apps/cli/dist/bin.js`. - -### 1. Check Hook Registration - -Read the local Claude Code settings and verify the PreToolUse governance hook is installed with SQLite storage: - -```bash -cat .claude/settings.json 2>/dev/null -``` - -Look for a `PreToolUse` entry whose command contains `claude-hook` and `--store sqlite`. If the file does not exist, does not contain the hook, or the hook is missing `--store sqlite`, proceed to step 2. If it does, skip to step 3. - -### 2. Install Hooks - -Run the AgentGuard hook installer with SQLite storage: - -```bash -<%= paths.cli %> claude-init --remove 2>/dev/null; <%= paths.cli %> claude-init --store sqlite -``` - -This writes both PreToolUse (governance enforcement for all tools) and PostToolUse (Bash error monitoring) hooks into `.claude/settings.json`, configured to persist governance data to SQLite (`~/.agentguard/agentguard.db`). The `--remove` ensures any existing hooks without SQLite are replaced. - -If installation fails, STOP. Do not proceed with development work without governance. - -### 3. Verify Telemetry Directories - -Ensure the telemetry output paths exist: - -```bash -mkdir -p .agentguard logs -``` - -These directories are used by: -- `~/.agentguard/agentguard.db` — SQLite governance database (events, decisions, sessions) -- `<%= paths.logs %>` — aggregated telemetry records - -### 4. Verify Policy File - -Check that a governance policy is loaded: - -```bash -ls <%= paths.policy %> 2>/dev/null || ls agentguard.yml 2>/dev/null || ls .<%= paths.policy %> 2>/dev/null -``` - -If a policy file exists, governance rules are active. If no policy file is found, warn: "No policy file found — governance running in fail-open mode (allow all)." - -### 5. 
Confirm Governance Active - -Report the status: - -``` -Governance runtime active. -PreToolUse hooks: registered -Storage: SQLite (~/.agentguard/agentguard.db) -Telemetry paths: ready -Policy: <filename or "none (fail-open)"> -``` - -## Rules - -- This skill MUST be the first skill invoked in any autonomous workflow -- If hook installation fails, STOP — do not proceed with development work without governance -- Never modify `.claude/settings.json` manually — always use `<%= paths.cli %> claude-init` -- Never modify `<%= paths.policy %>` — this is the governance policy and is protected diff --git a/packages/swarm/templates/skills/sync-main.md b/packages/swarm/templates/skills/sync-main.md deleted file mode 100644 index d8da50ba..00000000 --- a/packages/swarm/templates/skills/sync-main.md +++ /dev/null @@ -1,27 +0,0 @@ -# Skill: Sync Main Branch - -Ensure the local `main` branch is up-to-date with the remote before starting any work. This prevents agents from operating on stale code when the scheduler creates a worktree from `main`. - -## Steps - -### 1. Fetch and Merge Remote Main - -```bash -git fetch origin main && git merge origin/main --ff-only -``` - -The `--ff-only` flag ensures a clean fast-forward. If the merge fails (e.g., local commits on main that diverge from origin), report the error and STOP — do not proceed with stale or conflicted state. - -### 2. 
Confirm Sync - -Report the current HEAD: - -```bash -git log --oneline -1 -``` - -## Rules - -- This skill MUST run before any other skill in scheduled task workflows -- If the fetch or merge fails, STOP — do not proceed with stale code -- Never force-push or reset main — only fast-forward merges are allowed diff --git a/packages/swarm/templates/skills/test-health-review.md b/packages/swarm/templates/skills/test-health-review.md deleted file mode 100644 index 27b36e26..00000000 --- a/packages/swarm/templates/skills/test-health-review.md +++ /dev/null @@ -1,261 +0,0 @@ -# Skill: Test Health Review - -Evaluate the health, coverage, and reliability of the test suite. Run tests, analyze coverage, detect regressions, identify untested code, and assess test quality. Publish a Test Health Report. Designed for daily scheduled execution. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. - -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If data is unavailable or ambiguous, proceed with available data and note limitations -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. 
Build the Project - -Build must succeed before tests can run: - -```bash -pnpm build -``` - -If the build fails, record the error output and skip to Step 8 (Generate Report) with build failure as the primary finding. Do NOT attempt to fix the build — that is the Coder Agent's job. - -### 3. Run Tests (vitest) - -Run the vitest test suite and capture structured output: - -```bash -npx vitest run --reporter=verbose 2>&1 -``` - -Parse the output to extract: -- **Total tests**: Count of test cases -- **Passed**: Count of passing tests -- **Failed**: Count of failing tests (with names and error messages) -- **Skipped**: Count of skipped tests -- **Duration**: Total execution time -- **Per-file results**: Pass/fail status for each test file - -### 4. Run Coverage Analysis - -Run coverage using vitest's built-in coverage: - -```bash -pnpm test:coverage 2>&1 -``` - -Parse the output to extract: -- **Line coverage %**: Overall and per-file -- **Branch coverage %**: Overall and per-file -- **Function coverage %**: Overall and per-file -- **Uncovered lines**: File paths and line ranges with zero coverage -- **Threshold status**: Whether the 50% line coverage minimum is met - -### 5. Run Type Check - -Verify TypeScript strict mode compliance: - -```bash -npx tsc --noEmit 2>&1 -``` - -Parse output for: -- **Error count**: Total type errors -- **Error locations**: File paths and line numbers -- **Error categories**: Missing types, type mismatches, unused variables, etc. - -### 6. Analyze Test-to-Code Ratio - -Calculate the ratio of test code to source code: - -Count source files and test files: - -```bash -find packages/ apps/ -name "*.ts" -not -name "*.test.ts" -not -path "*/node_modules/*" -not -path "*/dist/*" | wc -l -find packages/ apps/ -name "*.test.*" -not -path "*/node_modules/*" | wc -l -``` - -Count lines of source code vs. 
test code: - -```bash -find packages/ apps/ -name "*.ts" -not -name "*.test.ts" -not -path "*/node_modules/*" -not -path "*/dist/*" -exec cat {} + | wc -l -find packages/ apps/ -name "*.test.*" -not -path "*/node_modules/*" -exec cat {} + | wc -l -``` - -Calculate: -- **Test-to-code ratio**: test lines / source lines -- **Test file coverage**: % of source modules that have a corresponding test file -- **Untested modules**: Source files with no matching test file in their package's `tests/` directory - -For each source file, check if a corresponding test exists in the same package: -- `packages/kernel/src/kernel.ts` → `packages/kernel/tests/kernel.test.ts` or similar -- `packages/events/src/bus.ts` → `packages/events/tests/event-bus.test.ts` or similar - -List all source files that have NO corresponding test file. - -### 7. Analyze Recent CI History - -Fetch recent CI runs to detect patterns: - -```bash -gh run list --limit 20 --json databaseId,conclusion,headBranch,createdAt,name -``` - -Calculate: -- **CI pass rate**: % of runs with conclusion "success" in last 20 runs -- **Failure frequency**: Runs that failed, grouped by failure type -- **Flaky signal**: Branches where the same commit has both pass and fail runs -- **Average CI duration**: If available from run metadata - -Also check for any currently failing CI on open PRs: - -```bash -gh pr list --state open --json number,title,statusCheckRollup --jq '.[] | select(.statusCheckRollup != null) | {number, title, checks: [.statusCheckRollup[] | {name: .name, conclusion: .conclusion}]}' -``` - -### 8. Generate Test Health Report - -Compose a structured report in markdown: - -**Header**: -- Generation timestamp (UTC) -- HEAD commit SHA -- Build status (success/failure) - -**Test Results Dashboard** (table): -| Suite | Total | Passed | Failed | Skipped | Duration | -Showing vitest results. 
- -**Failed Tests** (if any): -List each failing test with: -- Test file and test name -- Error message (first 3 lines) -- Severity assessment (regression vs. known failure) - -**Coverage Summary** (table): -| Metric | Current | Threshold | Status | -Showing line, branch, and function coverage vs. thresholds. - -**Lowest Coverage Files** (table, top 10): -| File | Line % | Branch % | Uncovered Lines | -Files sorted by lowest coverage first. - -**Untested Modules** (list): -Source files with no corresponding test file, grouped by directory. - -**Test-to-Code Ratio**: -- Overall ratio -- Per-package breakdown (kernel, events, policy, invariants, adapters, cli, core) -- Comparison note (healthy ratio is typically 0.8-1.5) - -**CI Pipeline Health**: -- Pass rate (last 20 runs) -- Failure pattern summary -- Flaky test signals -- Currently failing PRs - -**Type Safety**: -- Type error count -- Error locations (if any) - -**Recommendations** (numbered, max 5): -Top 5 actions to improve test health, prioritized by impact: -1. Fix failing tests (if any) -2. Add tests for untested modules (list specific files) -3. Improve coverage for lowest-coverage files -4. Address flaky tests (if detected) -5. Fix type errors (if any) - -### 9. 
Route Output (Report Routing Protocol) - -Apply the `report-routing` protocol to determine where output goes: - -**Assess severity**: Check if ANY of the following critical conditions exist: -- Test failures detected (any failing tests) -- Build failure -- CI pass rate below 50% -- Coverage dropped below threshold - -**If critical conditions exist → ALERT tier**: - -First, check if a tracking issue already exists: - -```bash -gh issue list --state open --label "source:test-agent" --label "<%= labels.critical %>" --json number,title -``` - -If failing tests are found and no existing tracking issue covers them, create ONE alert issue: - -```bash -gh issue create \ - --title "Test failures detected — $(date +%Y-%m-%d)" \ - --body "<failing test details with file paths and error messages>" \ - --label "source:test-agent" --label "<%= labels.critical %>" --label "task:bug" --label "<%= labels.pending %>" -``` - -Cap at **1 alert issue per run**. Do NOT create a separate "Test Health Report" issue. - -**If no critical conditions → REPORT tier**: - -Write the full report to a local file instead of creating a GitHub issue: - -```bash -mkdir -p .agentguard/reports -cat > .agentguard/reports/test-agent-$(date +%Y-%m-%d).md <<'REPORT_EOF' -<test health report markdown> -REPORT_EOF -``` - -Close any previous test health report issues that are still open (cleanup from before routing was implemented): - -```bash -PREV=$(gh issue list --state open --label "source:test-agent" --json number --jq '.[].number' 2>/dev/null) -for num in $PREV; do - gh issue close "$num" --comment "Superseded — reports now written to .agentguard/reports/" 2>/dev/null || true -done -``` - -**If all tests pass AND no findings above INFO → LOG tier**: - -```bash -mkdir -p .agentguard/logs -echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) [test-agent] All tests passing. Coverage: N%. CI pass rate: N%." >> .agentguard/logs/swarm.log -``` - -### 10. 
Summary - -Report: -- **Build status**: Success / Failure -- **Tests**: N passed / N failed / N skipped -- **Coverage**: N% lines (threshold: 50%) -- **Type errors**: N -- **CI pass rate**: N% (last 20 runs) -- **Untested modules**: N files -- **Test-to-code ratio**: N -- **Output routed to**: ALERT (issue #N) / REPORT (.agentguard/reports/test-agent-DATE.md) / LOG -- **Top recommendation**: Brief statement of the single most important test health action - -## Rules - -- Create a maximum of **1 alert issue per run** — only for critical findings (test failures, build failure, CI collapse) -- **Routine reports go to `.agentguard/reports/`, NOT GitHub issues** — follow the report-routing protocol -- **Never fix tests** — only report findings. Fixing is the Coder Agent's job. -- **Never modify source code** — this agent is read-only except for GitHub issues and report files -- **Never assign issues** — that is the Coder Agent's job -- If the build fails, still produce a report (with build failure as primary finding) -- If `gh` CLI is not authenticated, report the error and STOP -- Do not create duplicate alert issues — check for existing ones first -- Coverage analysis uses vitest's built-in coverage reporting. diff --git a/packages/swarm/templates/skills/triage-failing-ci.md b/packages/swarm/templates/skills/triage-failing-ci.md deleted file mode 100644 index a677f588..00000000 --- a/packages/swarm/templates/skills/triage-failing-ci.md +++ /dev/null @@ -1,283 +0,0 @@ -# Skill: Triage Failing CI - -Diagnose failed CI runs on open PR branches, check governance logs for related denials, apply minimal fixes, and push them. Keeps the pipeline unblocked so reviews and merges aren't stalled by lint errors, type mismatches, or broken tests. - -## Autonomy Directive - -This skill runs as an **unattended scheduled task**. No human is present to answer questions. 
- -- **NEVER pause to ask for clarification or confirmation** — make your best judgment and proceed -- **NEVER use AskUserQuestion or any interactive prompt** — all decisions must be made autonomously -- If a fix attempt fails after 2 tries, **skip and report** — do not keep retrying -- If governance activation fails, log the failure and **STOP** -- If `gh` CLI fails, log the error and **STOP** -- Default to the **safest option** in every ambiguous situation (skip > attempt) - -## Prerequisites - -Run `start-governance-runtime` first. All scheduled skills must operate under governance. - -## Steps - -### 0. Skip-if-Green Guard (execute FIRST) - -Before any other step, check if there are recent CI failures: - -```bash -gh run list --status failure --limit 5 --json databaseId --jq length -``` - -If the result is 0: output "All CI runs green. No triage needed." and **STOP immediately**. Do not start governance runtime or perform any further work. - -### 1. Start Governance Runtime - -Invoke the `start-governance-runtime` skill to ensure the AgentGuard kernel is active and intercepting all tool calls. If governance cannot be activated, STOP — do not proceed without governance. - -### 2. Find Failed CI Runs - -```bash -gh run list --status failure --limit 10 --json databaseId,headBranch,event,conclusion,createdAt,name -``` - -Filter results: -- **Only PR branches** — skip runs on `main` or `master` -- **Only runs created in the last 24 hours** — skip stale failures -- Select up to **3 failed runs** for this invocation - -If no failed runs match, report "No recent CI failures to triage" and STOP. - -### 3. Diagnose Each Failure - -For each selected failed run: - -#### 3a. Fetch the Failure Logs - -```bash -gh run view <RUN_ID> --log-failed -``` - -#### 3b. Identify the Associated PR - -```bash -gh pr list --head <HEAD_BRANCH> --state open --json number,title --jq '.[0]' -``` - -If no open PR exists for the branch, skip this run. - -#### 3c. 
Check Governance Context - -Check if governance denials during the PR's development may have contributed to the failure: - -```bash -git fetch origin <HEAD_BRANCH> -git log origin/<HEAD_BRANCH> --oneline -5 -``` - -Look for governance event files associated with this branch: - -```bash -ls .agentguard/events/*.jsonl 2>/dev/null | head -5 -cat .agentguard/events/*.jsonl 2>/dev/null | grep "ActionDenied\|PolicyDenied" | grep -i "<HEAD_BRANCH>" | head -10 -``` - -If governance denials are found: -- Check if a denied file write or denied shell command correlates with the CI failure -- Example: if `file.write` was denied for a test file, and CI fails on tests — the denial may be the root cause -- Note governance-related root causes in the diagnostic comment - -#### 3d. Classify the Failure - -Read the log output and classify into one of these categories: - -| Category | Indicators | -|----------|------------| -| **lint** | ESLint errors, `pnpm lint` exit code | -| **format** | Prettier check failures, `pnpm format` exit code | -| **typecheck** | `tsc` errors, `TS\d+:` error codes | -| **test** | vitest/test failures, assertion errors, `pnpm test` exit code | -| **build** | `esbuild` errors, `pnpm build` exit code | -| **governance** | Failure correlates with a governance denial (from step 3c) | -| **other** | Network errors, timeout, infrastructure issues | - -If the category is **other**, skip this run — report it but do not attempt a fix. - -If the category is **governance**, do not attempt a fix — report the governance denial as the root cause and suggest policy review. - -### 4. Apply the Fix - -#### 4a. Check Out the Branch - -```bash -git fetch origin <HEAD_BRANCH> -git checkout <HEAD_BRANCH> -git pull origin <HEAD_BRANCH> -``` - -#### 4b. Fix by Category - -**Lint errors:** - -```bash -pnpm lint:fix -pnpm lint -``` - -If errors persist after auto-fix, read the specific errors and fix manually. 
- -**Format errors:** - -```bash -pnpm format:fix -pnpm format -``` - -**Type errors:** - -Read the `tsc` error output. Fix the specific type issues in the reported files. Then verify: - -```bash -pnpm ts:check -``` - -**Test failures:** - -Read the test output. Investigate the failing test and the code it exercises: -- If the test expectation is wrong (code is correct), update the test -- If the code has a bug, fix the code -- If a snapshot is stale, update the snapshot - -Then verify: - -```bash -pnpm test -``` - -**Build errors:** - -Read the build output. Fix the reported issues (missing exports, syntax errors, etc.). Then verify: - -```bash -pnpm build -``` - -#### 4c. If the Fix Attempt Fails - -If you cannot resolve the failure after **2 attempts**, STOP fixing this run. Post a diagnostic comment on the PR instead (see step 5c). - -### 5. Verify, Commit, and Push - -#### 5a. Run the Full Suite - -```bash -pnpm build && pnpm ts:check && pnpm lint && pnpm format && pnpm ts:test && pnpm test -``` - -If any step fails that was not part of the original failure, do not push — you may have introduced a regression. Revert your changes and skip this run. - -#### 5b. Commit and Push - -```bash -git add <fixed-files> -git commit -m "fix(ci): resolve <CATEGORY> failure — <brief description>" -git push origin <HEAD_BRANCH> -``` - -#### 5c. 
Comment on the PR - -```bash -gh pr comment <PR_NUMBER> --body "**AgentGuard CI Triage Bot** — automated fix applied - -## Diagnosis - -- **Failed run**: <RUN_ID> -- **Category**: <CATEGORY> -- **Root cause**: <1-2 sentence explanation> -- **Governance context**: <any related denials or "no governance denials detected"> - -## Fix Applied - -<Brief description of what was changed and why> - -## Verification - -Full suite passed locally: build, typecheck, lint, format, ts:test, test - ---- -*Automated fix by triage-failing-ci skill on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" -``` - -If the fix attempt failed (step 4c), post a diagnostic comment instead: - -```bash -gh pr comment <PR_NUMBER> --body "**AgentGuard CI Triage Bot** — diagnosis only (could not auto-fix) - -## Diagnosis - -- **Failed run**: <RUN_ID> -- **Category**: <CATEGORY> -- **Root cause**: <1-2 sentence explanation> -- **Governance context**: <any related denials or "no governance denials detected"> -- **Why auto-fix failed**: <explanation> - -## Suggested Manual Fix - -<Specific steps the developer should take> - ---- -*Automated diagnosis by triage-failing-ci skill on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" -``` - -If the failure is **governance-related**: - -```bash -gh pr comment <PR_NUMBER> --body "**AgentGuard CI Triage Bot** — governance-related failure detected - -## Diagnosis - -- **Failed run**: <RUN_ID> -- **Category**: governance -- **Root cause**: A governance policy denial may have prevented required file changes -- **Denied actions**: <list of relevant denials from governance logs> - -## Recommended Action - -Review the governance policy to determine if the denial was intentional: -- Run \`<%= paths.cli %> inspect --last\` to see full decision history -- Check if the denied action is necessary for CI to pass -- If the denial was correct, the implementation approach needs adjustment -- If the denial was overly restrictive, consider a policy update - ---- -*Automated diagnosis by triage-failing-ci skill 
on $(date -u +%Y-%m-%dT%H:%M:%SZ)*" -``` - -### 6. Return to Original Branch - -After processing all runs: - -```bash -git checkout - -``` - -### 7. Summary - -Report: -- **Runs triaged**: N (list run IDs, branches, and categories) -- **Fixes pushed**: N (list PR numbers and commit messages) -- **Governance-related failures**: N (list branches and denied actions) -- **Diagnosis only (unfixable)**: N (list PR numbers and reasons) -- **Skipped (stale/no PR/other)**: N - -## Rules - -- Fix a maximum of **3 CI failures per run** -- **Never fix failures on `main` or `master`** — only PR branches -- **Never force push** — always regular push -- **Never modify tests to make them pass if the code is wrong** — fix the code instead -- **Never skip or disable tests, lint rules, or checks** to make CI green -- **Never push if the full suite doesn't pass** — revert and report instead -- Only commit files that are directly related to the CI fix — do not sneak in unrelated changes -- Skip runs older than 24 hours -- If `gh` CLI is not authenticated, report the error and STOP -- If the branch has merge conflicts, skip it and report the conflict in a PR comment -- For governance-related failures, report but do NOT attempt to bypass the governance policy diff --git a/packages/swarm/tests/config.test.ts b/packages/swarm/tests/config.test.ts deleted file mode 100644 index cf0672ae..00000000 --- a/packages/swarm/tests/config.test.ts +++ /dev/null @@ -1,97 +0,0 @@ -import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { mkdtempSync, rmSync, writeFileSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { loadDefaultConfig, loadConfig } from '../src/config.js'; - -describe('loadDefaultConfig', () => { - it('returns a valid SwarmConfig with all required fields', () => { - const config = loadDefaultConfig(); - - expect(config.swarm).toBeDefined(); - expect(Array.isArray(config.swarm.tiers)).toBe(true); - 
expect(config.swarm.tiers.length).toBeGreaterThan(0); - expect(config.swarm.schedules).toBeDefined(); - expect(config.swarm.paths).toBeDefined(); - expect(config.swarm.labels).toBeDefined(); - expect(config.swarm.thresholds).toBeDefined(); - }); - - it('includes all standard tiers', () => { - const config = loadDefaultConfig(); - expect(config.swarm.tiers).toContain('core'); - expect(config.swarm.tiers).toContain('governance'); - expect(config.swarm.tiers).toContain('ops'); - expect(config.swarm.tiers).toContain('quality'); - }); - - it('includes default paths', () => { - const config = loadDefaultConfig(); - expect(config.swarm.paths.policy).toBe('agentguard.yaml'); - expect(config.swarm.paths.roadmap).toBe('ROADMAP.md'); - }); - - it('includes default thresholds', () => { - const config = loadDefaultConfig(); - expect(typeof config.swarm.thresholds.maxOpenPRs).toBe('number'); - expect(typeof config.swarm.thresholds.prStaleHours).toBe('number'); - expect(typeof config.swarm.thresholds.blastRadiusHigh).toBe('number'); - }); -}); - -describe('loadConfig', () => { - let tmpDir: string; - - beforeEach(() => { - tmpDir = mkdtempSync(join(tmpdir(), 'swarm-config-')); - }); - - afterEach(() => { - rmSync(tmpDir, { recursive: true, force: true }); - }); - - it('returns defaults when no config file exists', () => { - const config = loadConfig(tmpDir); - const defaults = loadDefaultConfig(); - expect(config).toEqual(defaults); - }); - - it('merges user config with defaults', () => { - const userYaml = ` -swarm: - thresholds: - maxOpenPRs: 10 -`; - writeFileSync(join(tmpDir, 'agentguard-swarm.yaml'), userYaml); - const config = loadConfig(tmpDir); - - expect(config.swarm.thresholds.maxOpenPRs).toBe(10); - // Other defaults should be preserved - expect(config.swarm.paths.policy).toBe('agentguard.yaml'); - }); - - it('allows overriding schedules', () => { - const userYaml = ` -swarm: - schedules: - coder-agent: '0 8 * * *' -`; - writeFileSync(join(tmpDir, 
'agentguard-swarm.yaml'), userYaml); - const config = loadConfig(tmpDir); - - expect(config.swarm.schedules['coder-agent']).toBe('0 8 * * *'); - }); - - it('allows overriding tiers', () => { - const userYaml = ` -swarm: - tiers: - - core - - quality -`; - writeFileSync(join(tmpDir, 'agentguard-swarm.yaml'), userYaml); - const config = loadConfig(tmpDir); - - expect(config.swarm.tiers).toEqual(['core', 'quality']); - }); -}); diff --git a/packages/swarm/tests/loop-guards.test.ts b/packages/swarm/tests/loop-guards.test.ts deleted file mode 100644 index d9798fe7..00000000 --- a/packages/swarm/tests/loop-guards.test.ts +++ /dev/null @@ -1,120 +0,0 @@ -import { describe, it, expect } from 'vitest'; -import { checkLoopGuards } from '../src/loop-guards.js'; -import type { LoopGuardConfig, SquadState } from '../src/types.js'; - -const defaultGuards: LoopGuardConfig = { - maxOpenPRsPerSquad: 3, - maxRetries: 3, - maxBlastRadius: 20, - maxRunMinutes: 10, -}; - -describe('loop guards', () => { - it('passes when all guards clear', () => { - const state: SquadState = { - squad: 'kernel', - sprint: { goal: 'test', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { open: 1, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - const result = checkLoopGuards(defaultGuards, state, { - retryCount: 0, - predictedFileChanges: 5, - runStartTime: Date.now(), - }); - expect(result.allowed).toBe(true); - expect(result.violations).toHaveLength(0); - }); - - it('fails budget guard when too many PRs open', () => { - const state: SquadState = { - squad: 'kernel', - sprint: { goal: 'test', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { open: 4, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - const result = checkLoopGuards(defaultGuards, state, { - retryCount: 0, - predictedFileChanges: 5, - runStartTime: Date.now(), - }); - expect(result.allowed).toBe(false); - expect(result.violations).toContain('budget'); - }); - - 
it('fails retry guard after 3 retries', () => { - const state: SquadState = { - squad: 'kernel', - sprint: { goal: 'test', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { open: 0, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - const result = checkLoopGuards(defaultGuards, state, { - retryCount: 4, - predictedFileChanges: 5, - runStartTime: Date.now(), - }); - expect(result.allowed).toBe(false); - expect(result.violations).toContain('retry'); - }); - - it('fails blast radius guard when too many files', () => { - const state: SquadState = { - squad: 'kernel', - sprint: { goal: 'test', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { open: 0, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - const result = checkLoopGuards(defaultGuards, state, { - retryCount: 0, - predictedFileChanges: 25, - runStartTime: Date.now(), - }); - expect(result.allowed).toBe(false); - expect(result.violations).toContain('blast-radius'); - }); - - it('fails time guard when run exceeds limit', () => { - const state: SquadState = { - squad: 'kernel', - sprint: { goal: 'test', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { open: 0, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - const result = checkLoopGuards(defaultGuards, state, { - retryCount: 0, - predictedFileChanges: 5, - runStartTime: Date.now() - 11 * 60 * 1000, // 11 min ago - }); - expect(result.allowed).toBe(false); - expect(result.violations).toContain('time'); - }); - - it('reports multiple violations', () => { - const state: SquadState = { - squad: 'kernel', - sprint: { goal: 'test', issues: [] }, - assignments: {}, - blockers: [], - prQueue: { open: 5, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - const result = checkLoopGuards(defaultGuards, state, { - retryCount: 4, - predictedFileChanges: 25, - runStartTime: Date.now() - 15 * 60 * 1000, - }); - 
expect(result.allowed).toBe(false); - expect(result.violations.length).toBeGreaterThanOrEqual(3); - }); -}); diff --git a/packages/swarm/tests/manifest.test.ts b/packages/swarm/tests/manifest.test.ts deleted file mode 100644 index ed5e075f..00000000 --- a/packages/swarm/tests/manifest.test.ts +++ /dev/null @@ -1,148 +0,0 @@ -import { describe, it, expect } from 'vitest'; -import { - loadManifest, - filterAgentsByTier, - resolveSchedule, - collectSkills, -} from '../src/manifest.js'; -import type { SwarmAgent, SwarmConfig } from '../src/types.js'; - -describe('loadManifest', () => { - it('returns a manifest with version and agents', () => { - const manifest = loadManifest(); - expect(manifest.version).toBeDefined(); - expect(Array.isArray(manifest.agents)).toBe(true); - expect(manifest.agents.length).toBeGreaterThan(0); - }); - - it('each agent has required fields', () => { - const manifest = loadManifest(); - for (const agent of manifest.agents) { - expect(typeof agent.id).toBe('string'); - expect(typeof agent.name).toBe('string'); - expect(typeof agent.tier).toBe('string'); - expect(typeof agent.cron).toBe('string'); - expect(Array.isArray(agent.skills)).toBe(true); - expect(typeof agent.promptTemplate).toBe('string'); - expect(typeof agent.description).toBe('string'); - } - }); -}); - -describe('filterAgentsByTier', () => { - const agents: SwarmAgent[] = [ - { - id: 'a1', - name: 'Agent 1', - tier: 'core', - cron: '* * * * *', - skills: [], - promptTemplate: 'a1', - description: 'desc', - }, - { - id: 'a2', - name: 'Agent 2', - tier: 'quality', - cron: '* * * * *', - skills: [], - promptTemplate: 'a2', - description: 'desc', - }, - { - id: 'a3', - name: 'Agent 3', - tier: 'ops', - cron: '* * * * *', - skills: [], - promptTemplate: 'a3', - description: 'desc', - }, - ]; - - it('filters agents by enabled tiers', () => { - const result = filterAgentsByTier(agents, ['core']); - expect(result).toHaveLength(1); - expect(result[0].id).toBe('a1'); - }); - - it('returns 
all agents when all tiers enabled', () => { - const result = filterAgentsByTier(agents, ['core', 'quality', 'ops']); - expect(result).toHaveLength(3); - }); - - it('returns empty when no tiers match', () => { - const result = filterAgentsByTier(agents, ['governance']); - expect(result).toHaveLength(0); - }); -}); - -describe('resolveSchedule', () => { - const agent: SwarmAgent = { - id: 'test-agent', - name: 'Test', - tier: 'core', - cron: '0 */2 * * *', - skills: [], - promptTemplate: 'test', - description: 'desc', - }; - - it('returns agent default cron when no override exists', () => { - const config = { - swarm: { - tiers: ['core' as const], - schedules: {}, - paths: {} as never, - labels: {} as never, - thresholds: {} as never, - }, - }; - expect(resolveSchedule(agent, config)).toBe('0 */2 * * *'); - }); - - it('returns overridden schedule when configured', () => { - const config = { - swarm: { - tiers: ['core' as const], - schedules: { 'test-agent': '0 8 * * *' }, - paths: {} as never, - labels: {} as never, - thresholds: {} as never, - }, - }; - expect(resolveSchedule(agent, config)).toBe('0 8 * * *'); - }); -}); - -describe('collectSkills', () => { - it('collects unique skills from all agents', () => { - const agents: SwarmAgent[] = [ - { - id: 'a1', - name: 'A1', - tier: 'core', - cron: '', - skills: ['run-tests', 'create-pr'], - promptTemplate: '', - description: '', - }, - { - id: 'a2', - name: 'A2', - tier: 'core', - cron: '', - skills: ['run-tests', 'review-pr'], - promptTemplate: '', - description: '', - }, - ]; - - const skills = collectSkills(agents); - expect(skills).toEqual(['create-pr', 'review-pr', 'run-tests']); // sorted - }); - - it('returns empty array for no agents', () => { - expect(collectSkills([])).toEqual([]); - }); -}); diff --git a/packages/swarm/tests/scaffolder.test.ts b/packages/swarm/tests/scaffolder.test.ts deleted file mode 100644 index 93be7f77..00000000 --- a/packages/swarm/tests/scaffolder.test.ts +++ /dev/null @@ -1,180 
+0,0 @@ -import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { mkdtempSync, rmSync, existsSync, readFileSync, mkdirSync, writeFileSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { scaffold } from '../src/scaffolder.js'; -import { loadManifest, filterAgentsByTier, collectSkills } from '../src/manifest.js'; -import { loadDefaultConfig } from '../src/config.js'; - -describe('scaffolder', () => { - let tmpDir: string; - - beforeEach(() => { - tmpDir = mkdtempSync(join(tmpdir(), 'swarm-test-')); - }); - - afterEach(() => { - rmSync(tmpDir, { recursive: true, force: true }); - }); - - it('should scaffold skills and config to project root', () => { - const result = scaffold({ projectRoot: tmpDir }); - - expect(result.skillsWritten).toBeGreaterThan(0); - expect(result.configWritten).toBe(true); - expect(result.agents.length).toBeGreaterThan(0); - - // Config file should exist - expect(existsSync(join(tmpDir, 'agentguard-swarm.yaml'))).toBe(true); - - // Skills directory should exist with files - expect(existsSync(join(tmpDir, '.claude', 'skills'))).toBe(true); - }); - - it('should not overwrite existing skills without force', () => { - // Create an existing skill - const skillsDir = join(tmpDir, '.claude', 'skills'); - mkdirSync(skillsDir, { recursive: true }); - writeFileSync(join(skillsDir, 'discover-next-issue.md'), 'custom content', 'utf8'); - - const result = scaffold({ projectRoot: tmpDir }); - - // Should skip the existing file - expect(result.skillsSkipped).toBeGreaterThan(0); - - // Existing file should be unchanged - const content = readFileSync(join(skillsDir, 'discover-next-issue.md'), 'utf8'); - expect(content).toBe('custom content'); - }); - - it('should overwrite existing skills with force flag', () => { - const skillsDir = join(tmpDir, '.claude', 'skills'); - mkdirSync(skillsDir, { recursive: true }); - writeFileSync(join(skillsDir, 'discover-next-issue.md'), 'custom content', 
'utf8'); - - scaffold({ projectRoot: tmpDir, force: true }); - - const content = readFileSync(join(skillsDir, 'discover-next-issue.md'), 'utf8'); - expect(content).not.toBe('custom content'); - }); - - it('should not recreate config if it already exists', () => { - writeFileSync(join(tmpDir, 'agentguard-swarm.yaml'), 'existing', 'utf8'); - - const result = scaffold({ projectRoot: tmpDir }); - - expect(result.configWritten).toBe(false); - const content = readFileSync(join(tmpDir, 'agentguard-swarm.yaml'), 'utf8'); - expect(content).toBe('existing'); - }); - - it('should filter agents by tier', () => { - const result = scaffold({ projectRoot: tmpDir, tiers: ['core'] }); - - for (const agent of result.agents) { - expect(agent.tier).toBe('core'); - } - expect(result.agents.length).toBeGreaterThan(0); - expect(result.agents.length).toBeLessThan(26); // Less than total - }); - - it('should render template variables in skills', () => { - scaffold({ projectRoot: tmpDir }); - - const skillsDir = join(tmpDir, '.claude', 'skills'); - const discoverSkill = readFileSync(join(skillsDir, 'discover-next-issue.md'), 'utf8'); - - // Should have default values, not template tokens - expect(discoverSkill).not.toContain('<%= paths.'); - expect(discoverSkill).not.toContain('<%= labels.'); - }); - - it('should apply custom config values to templates', () => { - // Write a custom config - writeFileSync( - join(tmpDir, 'agentguard-swarm.yaml'), - `swarm: - tiers: - - core - paths: - policy: custom-policy.yaml - roadmap: PLAN.md - swarmState: .governance/state.json - logs: governance/events.jsonl - cli: npx aguard - labels: - pending: 'todo' - inProgress: 'doing' - review: 'reviewing' - blocked: 'stuck' - critical: 'p0' - high: 'p1' - medium: 'p2' - low: 'p3' - developer: 'dev' - architect: 'arch' - auditor: 'audit' -`, - 'utf8' - ); - - scaffold({ projectRoot: tmpDir, force: true }); - - const skillsDir = join(tmpDir, '.claude', 'skills'); - const files = 
require('node:fs').readdirSync(skillsDir) as string[]; - const mdFiles = files.filter((f: string) => f.endsWith('.md')); - - // Check that at least one skill contains the custom values - let foundCustomPath = false; - for (const file of mdFiles) { - const content = readFileSync(join(skillsDir, file), 'utf8'); - if (content.includes('custom-policy.yaml')) { - foundCustomPath = true; - break; - } - } - expect(foundCustomPath).toBe(true); - }); -}); - -describe('manifest', () => { - it('should load the manifest', () => { - const manifest = loadManifest(); - - expect(manifest.version).toBe('1.0.0'); - expect(manifest.agents.length).toBeGreaterThan(0); - }); - - it('should filter agents by tier', () => { - const manifest = loadManifest(); - const coreAgents = filterAgentsByTier(manifest.agents, ['core']); - - expect(coreAgents.length).toBeGreaterThan(0); - for (const agent of coreAgents) { - expect(agent.tier).toBe('core'); - } - }); - - it('should collect unique skills', () => { - const manifest = loadManifest(); - const skills = collectSkills(manifest.agents); - - expect(skills.length).toBeGreaterThan(0); - // Should be sorted - const sorted = [...skills].sort(); - expect(skills).toEqual(sorted); - // No duplicates - expect(new Set(skills).size).toBe(skills.length); - }); -}); - -describe('config', () => { - it('should load default config', () => { - const config = loadDefaultConfig(); - - expect(config.swarm).toBeDefined(); - expect(config.swarm.tiers).toContain('core'); - expect(config.swarm.paths.policy).toBe('agentguard.yaml'); - expect(config.swarm.labels.pending).toBe('status:pending'); - }); -}); diff --git a/packages/swarm/tests/schema.test.ts b/packages/swarm/tests/schema.test.ts deleted file mode 100644 index 7c28fd14..00000000 --- a/packages/swarm/tests/schema.test.ts +++ /dev/null @@ -1,257 +0,0 @@ -import { describe, it, expect } from 'vitest'; -import { readFileSync } from 'node:fs'; -import { join, dirname } from 'node:path'; -import { fileURLToPath } 
from 'node:url'; -import { parse as parseYaml } from 'yaml'; -import { - validateSwarmManifest, - validateSquadManifest, - validateSwarmConfig, - SWARM_MANIFEST_SCHEMA, -} from '../src/schema.js'; -import { loadManifest } from '../src/manifest.js'; - -const __dirname = dirname(fileURLToPath(import.meta.url)); -const TEMPLATES_DIR = join(__dirname, '..', 'templates', 'config'); - -describe('SWARM_MANIFEST_SCHEMA', () => { - it('has a JSON Schema $schema field', () => { - expect(SWARM_MANIFEST_SCHEMA.$schema).toBe('https://json-schema.org/draft/2020-12/schema'); - }); - - it('has title and description', () => { - expect(SWARM_MANIFEST_SCHEMA.title).toBeDefined(); - expect(SWARM_MANIFEST_SCHEMA.description).toBeDefined(); - }); -}); - -describe('validateSwarmManifest', () => { - it('validates the embedded manifest.json', () => { - const manifest = loadManifest(); - const result = validateSwarmManifest(manifest); - expect(result.valid).toBe(true); - expect(result.errors).toHaveLength(0); - }); - - it('rejects an empty object', () => { - const result = validateSwarmManifest({}); - expect(result.valid).toBe(false); - expect(result.errors.length).toBeGreaterThan(0); - expect(result.errors.some((e) => e.message.includes('Required'))).toBe(true); - }); - - it('rejects manifest with invalid version format', () => { - const result = validateSwarmManifest({ version: 'bad', agents: [] }); - expect(result.valid).toBe(false); - expect(result.errors.some((e) => e.path === '$.version')).toBe(true); - }); - - it('rejects manifest with empty agents array', () => { - const result = validateSwarmManifest({ version: '1.0.0', agents: [] }); - expect(result.valid).toBe(false); - expect(result.errors.some((e) => e.path === '$.agents')).toBe(true); - }); - - it('rejects agent with invalid tier', () => { - const result = validateSwarmManifest({ - version: '1.0.0', - agents: [ - { - id: 'test-agent', - name: 'Test Agent', - tier: 'invalid-tier', - cron: '0 * * * *', - skills: [], - 
promptTemplate: 'test', - description: 'test', - }, - ], - }); - expect(result.valid).toBe(false); - expect(result.errors.some((e) => e.path.includes('tier'))).toBe(true); - }); - - it('rejects agent with missing required fields', () => { - const result = validateSwarmManifest({ - version: '1.0.0', - agents: [{ id: 'test' }], - }); - expect(result.valid).toBe(false); - expect(result.errors.some((e) => e.message.includes('Required'))).toBe(true); - }); - - it('accepts a minimal valid manifest', () => { - const result = validateSwarmManifest({ - version: '1.0.0', - agents: [ - { - id: 'test-agent', - name: 'Test Agent', - tier: 'core', - cron: '0 * * * *', - skills: ['test-skill'], - promptTemplate: 'Run tests', - description: 'Runs tests', - }, - ], - }); - expect(result.valid).toBe(true); - }); - - it('rejects non-object input', () => { - expect(validateSwarmManifest(null).valid).toBe(false); - expect(validateSwarmManifest('string').valid).toBe(false); - expect(validateSwarmManifest(42).valid).toBe(false); - }); -}); - -describe('validateSquadManifest', () => { - it('accepts a valid squad manifest', () => { - const agent = { - id: 'kernel-em', - rank: 'em', - driver: 'claude-code', - model: 'sonnet', - cron: '10 */3 * * *', - skills: ['sprint-management'], - }; - const result = validateSquadManifest({ - version: '1.0.0', - org: { director: { ...agent, id: 'director', rank: 'director' } }, - squads: { - kernel: { - name: 'Kernel', - repo: 'agent-guard', - em: agent, - agents: { sr: { ...agent, id: 'kernel-sr', rank: 'senior' } }, - }, - }, - loopGuards: { - maxOpenPRsPerSquad: 5, - maxRetries: 3, - maxBlastRadius: 20, - maxRunMinutes: 15, - }, - }); - expect(result.valid).toBe(true); - }); - - it('rejects squad manifest missing loopGuards', () => { - const result = validateSquadManifest({ - version: '1.0.0', - org: { director: { id: 'd', rank: 'director', driver: 'claude-code', model: 'opus', cron: '0 7 * * *', skills: [] } }, - squads: {}, - }); - 
expect(result.valid).toBe(false); - expect(result.errors.some((e) => e.path.includes('loopGuards'))).toBe(true); - }); - - it('rejects invalid agent rank', () => { - const result = validateSquadManifest({ - version: '1.0.0', - org: { - director: { - id: 'd', - rank: 'ceo', - driver: 'claude-code', - model: 'opus', - cron: '0 7 * * *', - skills: [], - }, - }, - squads: {}, - loopGuards: { maxOpenPRsPerSquad: 5, maxRetries: 3, maxBlastRadius: 20, maxRunMinutes: 15 }, - }); - expect(result.valid).toBe(false); - expect(result.errors.some((e) => e.path.includes('rank'))).toBe(true); - }); -}); - -describe('validateSwarmConfig', () => { - it('accepts a valid swarm config', () => { - const result = validateSwarmConfig({ - swarm: { - tiers: ['core', 'governance'], - schedules: {}, - paths: { - policy: 'agentguard.yaml', - roadmap: 'ROADMAP.md', - swarmState: 'swarm-state.json', - logs: 'logs/', - reports: 'reports/', - swarmLogs: 'swarm-logs/', - cli: 'npx agentguard', - }, - labels: { pending: 'pending' }, - thresholds: { - maxOpenPRs: 10, - prStaleHours: 48, - blastRadiusHigh: 20, - }, - }, - }); - expect(result.valid).toBe(true); - }); - - it('rejects config with invalid tier', () => { - const result = validateSwarmConfig({ - swarm: { - tiers: ['invalid-tier'], - schedules: {}, - paths: { - policy: 'p', - roadmap: 'r', - swarmState: 's', - logs: 'l', - reports: 'r', - swarmLogs: 's', - cli: 'c', - }, - labels: {}, - thresholds: { maxOpenPRs: 10, prStaleHours: 48, blastRadiusHigh: 20 }, - }, - }); - expect(result.valid).toBe(false); - }); - - it('rejects config missing required paths', () => { - const result = validateSwarmConfig({ - swarm: { - tiers: ['core'], - schedules: {}, - paths: { policy: 'p' }, - labels: {}, - thresholds: { maxOpenPRs: 10, prStaleHours: 48, blastRadiusHigh: 20 }, - }, - }); - expect(result.valid).toBe(false); - expect(result.errors.some((e) => e.message.includes('Required'))).toBe(true); - }); - - it('rejects empty object', () => { - const 
result = validateSwarmConfig({}); - expect(result.valid).toBe(false); - }); -}); - -// --------------------------------------------------------------------------- -// Cross-validation: shipped default templates must pass their schemas -// --------------------------------------------------------------------------- - -describe('default template cross-validation', () => { - it('agentguard-swarm.default.yaml passes validateSwarmConfig', () => { - const yaml = readFileSync(join(TEMPLATES_DIR, 'agentguard-swarm.default.yaml'), 'utf8'); - const parsed = parseYaml(yaml) as unknown; - const result = validateSwarmConfig(parsed); - expect(result.errors).toEqual([]); - expect(result.valid).toBe(true); - }); - - it('squad-manifest.default.yaml passes validateSquadManifest', () => { - const yaml = readFileSync(join(TEMPLATES_DIR, 'squad-manifest.default.yaml'), 'utf8'); - const parsed = parseYaml(yaml) as unknown; - const result = validateSquadManifest(parsed); - expect(result.errors).toEqual([]); - expect(result.valid).toBe(true); - }); -}); diff --git a/packages/swarm/tests/squad-manifest.test.ts b/packages/swarm/tests/squad-manifest.test.ts deleted file mode 100644 index 3c64c526..00000000 --- a/packages/swarm/tests/squad-manifest.test.ts +++ /dev/null @@ -1,64 +0,0 @@ -import { describe, it, expect } from 'vitest'; -import { loadSquadManifest } from '../src/squad-manifest.js'; -import { readFileSync } from 'node:fs'; -import { join, dirname } from 'node:path'; -import { fileURLToPath } from 'node:url'; - -const __dirname = dirname(fileURLToPath(import.meta.url)); - -describe('loadSquadManifest', () => { - it('loads the default manifest', () => { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - const manifest = loadSquadManifest(yaml); - expect(manifest.version).toBe('1.0.0'); - expect(manifest.org.director.rank).toBe('director'); - expect(manifest.org.director.driver).toBe('claude-code'); - }); - - 
it('parses all 3 squads', () => { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - const manifest = loadSquadManifest(yaml); - expect(Object.keys(manifest.squads)).toEqual(['kernel', 'cloud', 'qa', 'studio']); - }); - - it('each squad has em + 5 agents', () => { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - const manifest = loadSquadManifest(yaml); - for (const [name, squad] of Object.entries(manifest.squads)) { - expect(squad.em.rank).toBe('em'); - expect(Object.keys(squad.agents)).toHaveLength(5); - } - }); - - it('builds agent identity strings', () => { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - const manifest = loadSquadManifest(yaml); - const sr = manifest.squads.kernel.agents.senior; - const identity = `${sr.driver}:${sr.model}:kernel:${sr.rank}`; - expect(identity).toBe('copilot-cli:sonnet:kernel:senior'); - }); - - it('parses loop guard config', () => { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - const manifest = loadSquadManifest(yaml); - expect(manifest.loopGuards.maxOpenPRsPerSquad).toBe(3); - expect(manifest.loopGuards.maxRetries).toBe(3); - expect(manifest.loopGuards.maxBlastRadius).toBe(20); - expect(manifest.loopGuards.maxRunMinutes).toBe(10); - }); -}); diff --git a/packages/swarm/tests/squad-scaffold.test.ts b/packages/swarm/tests/squad-scaffold.test.ts deleted file mode 100644 index de8aad00..00000000 --- a/packages/swarm/tests/squad-scaffold.test.ts +++ /dev/null @@ -1,107 +0,0 @@ -import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { scaffoldSquad } from '../src/scaffolder.js'; -import { loadSquadManifest } from '../src/squad-manifest.js'; -import { readFileSync, mkdtempSync, rmSync, existsSync, 
writeFileSync, mkdirSync } from 'node:fs'; -import { join, dirname } from 'node:path'; -import { tmpdir } from 'node:os'; -import { fileURLToPath } from 'node:url'; - -const __dirname = dirname(fileURLToPath(import.meta.url)); - -function loadDefaultManifest() { - const yaml = readFileSync( - join(__dirname, '..', 'templates', 'config', 'squad-manifest.default.yaml'), - 'utf8', - ); - return loadSquadManifest(yaml); -} - -describe('scaffoldSquad', () => { - let dir: string; - - beforeEach(() => { - dir = mkdtempSync(join(tmpdir(), 'scaffold-')); - }); - - afterEach(() => { - rmSync(dir, { recursive: true, force: true }); - }); - - it('creates squad state directory', () => { - const manifest = loadDefaultManifest(); - scaffoldSquad(dir, 'kernel', manifest.squads.kernel); - - expect(existsSync(join(dir, '.agentguard', 'squads', 'kernel'))).toBe(true); - }); - - it('creates initial state.json with squad name', () => { - const manifest = loadDefaultManifest(); - scaffoldSquad(dir, 'kernel', manifest.squads.kernel); - - const statePath = join(dir, '.agentguard', 'squads', 'kernel', 'state.json'); - expect(existsSync(statePath)).toBe(true); - const state = JSON.parse(readFileSync(statePath, 'utf8')); - expect(state.squad).toBe('kernel'); - expect(state.sprint).toEqual({ goal: '', issues: [] }); - expect(state.assignments).toEqual({}); - expect(state.blockers).toEqual([]); - expect(state.prQueue).toEqual({ open: 0, reviewed: 0, mergeable: 0 }); - expect(typeof state.updatedAt).toBe('string'); - }); - - it('creates learnings.json as empty array', () => { - const manifest = loadDefaultManifest(); - scaffoldSquad(dir, 'kernel', manifest.squads.kernel); - - const learningsPath = join(dir, '.agentguard', 'squads', 'kernel', 'learnings.json'); - expect(existsSync(learningsPath)).toBe(true); - const learnings = JSON.parse(readFileSync(learningsPath, 'utf8')); - expect(learnings).toEqual([]); - }); - - it('does not overwrite existing state.json', () => { - const manifest = 
loadDefaultManifest(); - const squadDir = join(dir, '.agentguard', 'squads', 'kernel'); - mkdirSync(squadDir, { recursive: true }); - - const customState = JSON.stringify({ squad: 'kernel', custom: true }); - writeFileSync(join(squadDir, 'state.json'), customState, 'utf8'); - - scaffoldSquad(dir, 'kernel', manifest.squads.kernel); - - const content = readFileSync(join(squadDir, 'state.json'), 'utf8'); - expect(JSON.parse(content).custom).toBe(true); - }); - - it('does not overwrite existing learnings.json', () => { - const manifest = loadDefaultManifest(); - const squadDir = join(dir, '.agentguard', 'squads', 'kernel'); - mkdirSync(squadDir, { recursive: true }); - - const customLearnings = JSON.stringify([{ lesson: 'test' }]); - writeFileSync(join(squadDir, 'learnings.json'), customLearnings, 'utf8'); - - scaffoldSquad(dir, 'kernel', manifest.squads.kernel); - - const content = readFileSync(join(squadDir, 'learnings.json'), 'utf8'); - expect(JSON.parse(content)).toEqual([{ lesson: 'test' }]); - }); - - it('scaffolds multiple squads independently', () => { - const manifest = loadDefaultManifest(); - scaffoldSquad(dir, 'kernel', manifest.squads.kernel); - scaffoldSquad(dir, 'cloud', manifest.squads.cloud); - - expect(existsSync(join(dir, '.agentguard', 'squads', 'kernel', 'state.json'))).toBe(true); - expect(existsSync(join(dir, '.agentguard', 'squads', 'cloud', 'state.json'))).toBe(true); - - const kernelState = JSON.parse( - readFileSync(join(dir, '.agentguard', 'squads', 'kernel', 'state.json'), 'utf8'), - ); - const cloudState = JSON.parse( - readFileSync(join(dir, '.agentguard', 'squads', 'cloud', 'state.json'), 'utf8'), - ); - expect(kernelState.squad).toBe('kernel'); - expect(cloudState.squad).toBe('cloud'); - }); -}); diff --git a/packages/swarm/tests/squad-state.test.ts b/packages/swarm/tests/squad-state.test.ts deleted file mode 100644 index 427c23b2..00000000 --- a/packages/swarm/tests/squad-state.test.ts +++ /dev/null @@ -1,68 +0,0 @@ -import { describe, 
it, expect, beforeEach, afterEach } from 'vitest'; -import { readSquadState, writeSquadState, readEMReport, writeEMReport, readDirectorBrief, writeDirectorBrief } from '../src/squad-state.js'; -import { mkdtempSync, rmSync, mkdirSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; - -describe('squad state', () => { - let dir: string; - - beforeEach(() => { - dir = mkdtempSync(join(tmpdir(), 'squad-')); - mkdirSync(join(dir, '.agentguard', 'squads', 'kernel'), { recursive: true }); - }); - - afterEach(() => { - rmSync(dir, { recursive: true, force: true }); - }); - - it('writes and reads squad state', () => { - const state = { - squad: 'kernel', - sprint: { goal: 'Go kernel Phase 2', issues: ['#860'] }, - assignments: { - senior: { current: '#860', status: 'implementing' }, - }, - blockers: [], - prQueue: { open: 1, reviewed: 0, mergeable: 0 }, - updatedAt: new Date().toISOString(), - }; - writeSquadState(dir, 'kernel', state); - const read = readSquadState(dir, 'kernel'); - expect(read?.squad).toBe('kernel'); - expect(read?.sprint.goal).toBe('Go kernel Phase 2'); - }); - - it('returns null for missing state', () => { - const read = readSquadState(dir, 'nonexistent'); - expect(read).toBeNull(); - }); - - it('writes and reads EM report', () => { - const report = { - squad: 'kernel', - timestamp: new Date().toISOString(), - health: 'green' as const, - summary: 'All clear', - blockers: [], - escalations: [], - metrics: { prsOpened: 2, prsMerged: 1, issuesClosed: 3, denials: 0, retries: 0 }, - }; - writeEMReport(dir, 'kernel', report); - const read = readEMReport(dir, 'kernel'); - expect(read?.health).toBe('green'); - }); - - it('writes and reads director brief', () => { - const brief = { - timestamp: new Date().toISOString(), - squads: {}, - escalationsForHuman: ['Need decision on Go vs Rust for hot path'], - overallHealth: 'yellow' as const, - }; - writeDirectorBrief(dir, brief); - const read = readDirectorBrief(dir); - 
expect(read?.overallHealth).toBe('yellow'); - expect(read?.escalationsForHuman).toHaveLength(1); - }); -}); diff --git a/packages/swarm/tests/types.test.ts b/packages/swarm/tests/types.test.ts deleted file mode 100644 index 0b72fc01..00000000 --- a/packages/swarm/tests/types.test.ts +++ /dev/null @@ -1,70 +0,0 @@ -import { describe, it, expect } from 'vitest'; -import type { - SquadManifest, - Squad, - SquadAgent, - SquadRank, - SquadState, - LoopGuardConfig, -} from '../src/types.js'; - -describe('Squad types', () => { - it('SquadAgent has driver, model, squad, rank fields', () => { - const agent: SquadAgent = { - id: 'kernel-senior', - rank: 'senior', - driver: 'copilot-cli', - model: 'sonnet', - cron: '0 */2 * * *', - skills: ['claim-issue', 'implement-issue', 'create-pr'], - }; - expect(agent.driver).toBe('copilot-cli'); - expect(agent.rank).toBe('senior'); - }); - - it('Squad contains em + 5 agents', () => { - const squad: Squad = { - name: 'kernel', - repo: 'agent-guard', - em: { - id: 'kernel-em', - rank: 'em', - driver: 'claude-code', - model: 'opus', - cron: '0 */3 * * *', - skills: ['squad-plan', 'squad-execute'], - }, - agents: { - 'product-lead': { id: 'kernel-pl', rank: 'product-lead', driver: 'claude-code', model: 'sonnet', cron: '0 6 * * *', skills: [] }, - architect: { id: 'kernel-arch', rank: 'architect', driver: 'claude-code', model: 'opus', cron: '0 */4 * * *', skills: [] }, - senior: { id: 'kernel-sr', rank: 'senior', driver: 'copilot-cli', model: 'sonnet', cron: '0 */2 * * *', skills: [] }, - junior: { id: 'kernel-jr', rank: 'junior', driver: 'copilot-cli', model: 'copilot', cron: '0 */2 * * *', skills: [] }, - qa: { id: 'kernel-qa', rank: 'qa', driver: 'copilot-cli', model: 'sonnet', cron: '0 */3 * * *', skills: [] }, - }, - }; - expect(Object.keys(squad.agents)).toHaveLength(5); - expect(squad.em.rank).toBe('em'); - }); - - it('SquadManifest has director + squads', () => { - const manifest: SquadManifest = { - version: '1.0.0', - org: { - 
director: { id: 'director', rank: 'director', driver: 'claude-code', model: 'opus', cron: '0 7,19 * * *', skills: [] }, - }, - squads: {}, - loopGuards: { - maxOpenPRsPerSquad: 3, - maxRetries: 3, - maxBlastRadius: 20, - maxRunMinutes: 10, - }, - }; - expect(manifest.org.director.rank).toBe('director'); - }); - - it('SquadRank includes all valid ranks', () => { - const ranks: SquadRank[] = ['director', 'em', 'product-lead', 'architect', 'senior', 'junior', 'qa']; - expect(ranks).toHaveLength(7); - }); -}); diff --git a/packages/swarm/tsconfig.json b/packages/swarm/tsconfig.json deleted file mode 100644 index 9cb4cda1..00000000 --- a/packages/swarm/tsconfig.json +++ /dev/null @@ -1,10 +0,0 @@ -{ - "extends": "../../tsconfig.base.json", - "compilerOptions": { - "outDir": "dist", - "rootDir": "src", - "resolveJsonModule": true - }, - "include": ["src"], - "exclude": ["dist", "tests", "templates"] -} diff --git a/packages/swarm/vitest.config.ts b/packages/swarm/vitest.config.ts deleted file mode 100644 index 04eb284c..00000000 --- a/packages/swarm/vitest.config.ts +++ /dev/null @@ -1,9 +0,0 @@ -import { defineConfig } from 'vitest/config'; - -export default defineConfig({ - test: { - include: ['tests/**/*.test.ts'], - environment: 'node', - testTimeout: 15_000, - }, -}); diff --git a/paper/agentguard-whitepaper.md b/paper/agentguard-whitepaper.md index c49d8642..28aa21af 100644 --- a/paper/agentguard-whitepaper.md +++ b/paper/agentguard-whitepaper.md @@ -700,10 +700,6 @@ agent-guard/ registry.ts # Renderer registry tui-renderer.ts # TUI renderer implementation tui-formatters.ts # TUI formatting helpers - swarm/src/ # @red-codes/swarm — Agent swarm templates - config.ts # Swarm configuration - manifest.ts # Swarm manifest parsing - scaffolder.ts # Swarm scaffolding telemetry/src/ # @red-codes/telemetry — Runtime telemetry and logging cloud-sink.ts # Cloud telemetry sink event-mapper.ts # Event mapping diff --git a/paper/diagrams/sdlc-swarm-control-plane.svg 
b/paper/diagrams/sdlc-swarm-control-plane.svg deleted file mode 100644 index 97423f9c..00000000 --- a/paper/diagrams/sdlc-swarm-control-plane.svg +++ /dev/null @@ -1,201 +0,0 @@ -<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 960 720" font-family="'Inter', 'Segoe UI', system-ui, -apple-system, sans-serif"> - <defs> - <!-- Neon glow filter --> - <filter id="glow" x="-20%" y="-20%" width="140%" height="140%"> - <feGaussianBlur stdDeviation="4" result="blur"/> - <feMerge> - <feMergeNode in="blur"/> - <feMergeNode in="SourceGraphic"/> - </feMerge> - </filter> - <!-- Subtle glow for center --> - <filter id="glow-strong" x="-30%" y="-30%" width="160%" height="160%"> - <feGaussianBlur stdDeviation="8" result="blur"/> - <feMerge> - <feMergeNode in="blur"/> - <feMergeNode in="SourceGraphic"/> - </feMerge> - </filter> - <!-- Soft glow for outer ring --> - <filter id="glow-soft" x="-10%" y="-10%" width="120%" height="120%"> - <feGaussianBlur stdDeviation="2" result="blur"/> - <feMerge> - <feMergeNode in="blur"/> - <feMergeNode in="SourceGraphic"/> - </feMerge> - </filter> - <!-- Arrow marker --> - <marker id="arrow" markerWidth="10" markerHeight="7" refX="10" refY="3.5" orient="auto" markerUnits="strokeWidth"> - <polygon points="0 0, 10 3.5, 0 7" fill="#22C55E" opacity="0.8"/> - </marker> - <!-- Dim arrow marker --> - <marker id="arrow-dim" markerWidth="10" markerHeight="7" refX="10" refY="3.5" orient="auto" markerUnits="strokeWidth"> - <polygon points="0 0, 10 3.5, 0 7" fill="#22C55E" opacity="0.4"/> - </marker> - <!-- State arrow --> - <marker id="state-arrow" markerWidth="8" markerHeight="6" refX="8" refY="3" orient="auto" markerUnits="strokeWidth"> - <polygon points="0 0, 8 3, 0 6" fill="#F59E0B" opacity="0.7"/> - </marker> - </defs> - - <!-- Background --> - <rect width="960" height="720" fill="#0D1117" rx="0"/> - - <!-- Subtle grid pattern --> - <g opacity="0.03" stroke="#22C55E"> - <line x1="0" y1="0" x2="0" y2="720" stroke-width="0.5"/><line x1="40" y1="0" 
x2="40" y2="720" stroke-width="0.5"/><line x1="80" y1="0" x2="80" y2="720" stroke-width="0.5"/><line x1="120" y1="0" x2="120" y2="720" stroke-width="0.5"/><line x1="160" y1="0" x2="160" y2="720" stroke-width="0.5"/><line x1="200" y1="0" x2="200" y2="720" stroke-width="0.5"/><line x1="240" y1="0" x2="240" y2="720" stroke-width="0.5"/><line x1="280" y1="0" x2="280" y2="720" stroke-width="0.5"/><line x1="320" y1="0" x2="320" y2="720" stroke-width="0.5"/><line x1="360" y1="0" x2="360" y2="720" stroke-width="0.5"/><line x1="400" y1="0" x2="400" y2="720" stroke-width="0.5"/><line x1="440" y1="0" x2="440" y2="720" stroke-width="0.5"/><line x1="480" y1="0" x2="480" y2="720" stroke-width="0.5"/><line x1="520" y1="0" x2="520" y2="720" stroke-width="0.5"/><line x1="560" y1="0" x2="560" y2="720" stroke-width="0.5"/><line x1="600" y1="0" x2="600" y2="720" stroke-width="0.5"/><line x1="640" y1="0" x2="640" y2="720" stroke-width="0.5"/><line x1="680" y1="0" x2="680" y2="720" stroke-width="0.5"/><line x1="720" y1="0" x2="720" y2="720" stroke-width="0.5"/><line x1="760" y1="0" x2="760" y2="720" stroke-width="0.5"/><line x1="800" y1="0" x2="800" y2="720" stroke-width="0.5"/><line x1="840" y1="0" x2="840" y2="720" stroke-width="0.5"/><line x1="880" y1="0" x2="880" y2="720" stroke-width="0.5"/><line x1="920" y1="0" x2="920" y2="720" stroke-width="0.5"/><line x1="960" y1="0" x2="960" y2="720" stroke-width="0.5"/> - <line x1="0" y1="0" x2="960" y2="0" stroke-width="0.5"/><line x1="0" y1="40" x2="960" y2="40" stroke-width="0.5"/><line x1="0" y1="80" x2="960" y2="80" stroke-width="0.5"/><line x1="0" y1="120" x2="960" y2="120" stroke-width="0.5"/><line x1="0" y1="160" x2="960" y2="160" stroke-width="0.5"/><line x1="0" y1="200" x2="960" y2="200" stroke-width="0.5"/><line x1="0" y1="240" x2="960" y2="240" stroke-width="0.5"/><line x1="0" y1="280" x2="960" y2="280" stroke-width="0.5"/><line x1="0" y1="320" x2="960" y2="320" stroke-width="0.5"/><line x1="0" y1="360" x2="960" y2="360" 
stroke-width="0.5"/><line x1="0" y1="400" x2="960" y2="400" stroke-width="0.5"/><line x1="0" y1="440" x2="960" y2="440" stroke-width="0.5"/><line x1="0" y1="480" x2="960" y2="480" stroke-width="0.5"/><line x1="0" y1="520" x2="960" y2="520" stroke-width="0.5"/><line x1="0" y1="560" x2="960" y2="560" stroke-width="0.5"/><line x1="0" y1="600" x2="960" y2="600" stroke-width="0.5"/><line x1="0" y1="640" x2="960" y2="640" stroke-width="0.5"/><line x1="0" y1="680" x2="960" y2="680" stroke-width="0.5"/><line x1="0" y1="720" x2="960" y2="720" stroke-width="0.5"/> - </g> - - <!-- Title --> - <text x="480" y="44" text-anchor="middle" fill="#E6EDF3" font-size="20" font-weight="700" letter-spacing="3">AUTONOMOUS SDLC SWARM</text> - <text x="480" y="66" text-anchor="middle" fill="#22C55E" font-size="12" font-weight="500" letter-spacing="5" opacity="0.8">CONTROL PLANE</text> - <line x1="320" y1="76" x2="640" y2="76" stroke="#22C55E" stroke-width="0.5" opacity="0.3"/> - - <!-- ==================== OUTER RING: Recovery Controller ==================== --> - <!-- Outer ring arc (partial circle) --> - <ellipse cx="480" cy="370" rx="380" ry="270" fill="none" stroke="#F59E0B" stroke-width="1" opacity="0.15" stroke-dasharray="4 6"/> - - <!-- Recovery Controller label --> - <text x="480" y="108" text-anchor="middle" fill="#F59E0B" font-size="10" font-weight="600" letter-spacing="4" opacity="0.6">RECOVERY CONTROLLER</text> - - <!-- Escalation states along top arc --> - <!-- NORMAL --> - <rect x="168" y="118" width="90" height="26" rx="4" fill="#0D1117" stroke="#10B981" stroke-width="1" opacity="0.9"/> - <text x="213" y="135" text-anchor="middle" fill="#10B981" font-size="9" font-weight="600" letter-spacing="1">NORMAL</text> - - <!-- Arrow NORMAL → ELEVATED --> - <line x1="261" y1="131" x2="315" y2="131" stroke="#F59E0B" stroke-width="1" opacity="0.5" marker-end="url(#state-arrow)"/> - - <!-- ELEVATED --> - <rect x="328" y="118" width="90" height="26" rx="4" fill="#0D1117" stroke="#F59E0B" 
stroke-width="1" opacity="0.9"/> - <text x="373" y="135" text-anchor="middle" fill="#F59E0B" font-size="9" font-weight="600" letter-spacing="1">ELEVATED</text> - - <!-- Arrow ELEVATED → HIGH --> - <line x1="421" y1="131" x2="495" y2="131" stroke="#F59E0B" stroke-width="1" opacity="0.5" marker-end="url(#state-arrow)"/> - - <!-- HIGH --> - <rect x="508" y="118" width="90" height="26" rx="4" fill="#0D1117" stroke="#F97316" stroke-width="1" opacity="0.9"/> - <text x="553" y="135" text-anchor="middle" fill="#F97316" font-size="9" font-weight="700" letter-spacing="1">HIGH</text> - - <!-- Arrow HIGH → LOCKDOWN --> - <line x1="601" y1="131" x2="655" y2="131" stroke="#F59E0B" stroke-width="1" opacity="0.5" marker-end="url(#state-arrow)"/> - - <!-- LOCKDOWN --> - <rect x="668" y="118" width="100" height="26" rx="4" fill="#0D1117" stroke="#EF4444" stroke-width="1.5" opacity="0.9"/> - <text x="718" y="135" text-anchor="middle" fill="#EF4444" font-size="9" font-weight="700" letter-spacing="1">LOCKDOWN</text> - - <!-- ==================== CENTER: Control Plane ==================== --> - <!-- Outer glow ring --> - <rect x="370" y="300" width="220" height="100" rx="8" fill="none" stroke="#22C55E" stroke-width="0.5" opacity="0.15" filter="url(#glow-strong)"/> - - <!-- Main box --> - <rect x="375" y="305" width="210" height="90" rx="6" fill="#0D1117" stroke="#22C55E" stroke-width="1.5" filter="url(#glow)"/> - <rect x="375" y="305" width="210" height="90" rx="6" fill="#22C55E" opacity="0.04"/> - - <!-- Inner accent line --> - <line x1="395" y1="340" x2="565" y2="340" stroke="#22C55E" stroke-width="0.5" opacity="0.3"/> - - <text x="480" y="332" text-anchor="middle" fill="#E6EDF3" font-size="15" font-weight="700" letter-spacing="2">CONTROL PLANE</text> - <text x="480" y="358" text-anchor="middle" fill="#8B949E" font-size="9" letter-spacing="1">GOVERNED ACTION KERNEL</text> - <text x="480" y="376" text-anchor="middle" fill="#484F58" font-size="8" letter-spacing="1">propose | evaluate | 
execute | emit</text> - - <!-- ==================== MODULE: Core (top) ==================== --> - <g filter="url(#glow-soft)"> - <rect x="405" y="168" width="150" height="70" rx="5" fill="#0D1117" stroke="#22C55E" stroke-width="1" opacity="0.9"/> - <rect x="405" y="168" width="150" height="70" rx="5" fill="#22C55E" opacity="0.03"/> - </g> - <text x="480" y="194" text-anchor="middle" fill="#22C55E" font-size="13" font-weight="700" letter-spacing="1">CORE</text> - <text x="480" y="212" text-anchor="middle" fill="#8B949E" font-size="8.5">coding, PRs, CI triage</text> - <text x="480" y="226" text-anchor="middle" fill="#484F58" font-size="7.5">7 agents</text> - <!-- Connector Core → Center --> - <line x1="480" y1="238" x2="480" y2="305" stroke="#22C55E" stroke-width="1" opacity="0.4" stroke-dasharray="3 3"/> - - <!-- ==================== MODULE: Governance (left) ==================== --> - <g filter="url(#glow-soft)"> - <rect x="120" y="290" width="170" height="70" rx="5" fill="#0D1117" stroke="#22C55E" stroke-width="1" opacity="0.9"/> - <rect x="120" y="290" width="170" height="70" rx="5" fill="#22C55E" opacity="0.03"/> - </g> - <text x="205" y="316" text-anchor="middle" fill="#22C55E" font-size="13" font-weight="700" letter-spacing="1">GOVERNANCE</text> - <text x="205" y="334" text-anchor="middle" fill="#8B949E" font-size="8.5">policy, invariants, audit</text> - <text x="205" y="348" text-anchor="middle" fill="#484F58" font-size="7.5">3 agents</text> - <!-- Connector Governance → Center --> - <line x1="290" y1="330" x2="375" y2="345" stroke="#22C55E" stroke-width="1" opacity="0.4" stroke-dasharray="3 3"/> - - <!-- ==================== MODULE: Ops (right) ==================== --> - <g filter="url(#glow-soft)"> - <rect x="670" y="290" width="170" height="70" rx="5" fill="#0D1117" stroke="#22C55E" stroke-width="1" opacity="0.9"/> - <rect x="670" y="290" width="170" height="70" rx="5" fill="#22C55E" opacity="0.03"/> - </g> - <text x="755" y="316" text-anchor="middle" 
fill="#22C55E" font-size="13" font-weight="700" letter-spacing="1">OPS</text> - <text x="755" y="334" text-anchor="middle" fill="#8B949E" font-size="8.5">planning, backlog, observability</text> - <text x="755" y="348" text-anchor="middle" fill="#484F58" font-size="7.5">8 agents</text> - <!-- Connector Ops → Center --> - <line x1="670" y1="330" x2="585" y2="345" stroke="#22C55E" stroke-width="1" opacity="0.4" stroke-dasharray="3 3"/> - - <!-- ==================== MODULE: Quality (bottom-left) ==================== --> - <g filter="url(#glow-soft)"> - <rect x="160" y="430" width="170" height="70" rx="5" fill="#0D1117" stroke="#22C55E" stroke-width="1" opacity="0.9"/> - <rect x="160" y="430" width="170" height="70" rx="5" fill="#22C55E" opacity="0.03"/> - </g> - <text x="245" y="456" text-anchor="middle" fill="#22C55E" font-size="13" font-weight="700" letter-spacing="1">QUALITY</text> - <text x="245" y="474" text-anchor="middle" fill="#8B949E" font-size="8.5">testing, security, architecture</text> - <text x="245" y="488" text-anchor="middle" fill="#484F58" font-size="7.5">7 agents</text> - <!-- Connector Quality → Center --> - <line x1="310" y1="445" x2="400" y2="395" stroke="#22C55E" stroke-width="1" opacity="0.4" stroke-dasharray="3 3"/> - - <!-- ==================== MODULE: Marketing (bottom-right) ==================== --> - <g filter="url(#glow-soft)"> - <rect x="620" y="430" width="180" height="70" rx="5" fill="#0D1117" stroke="#22C55E" stroke-width="1" opacity="0.9"/> - <rect x="620" y="430" width="180" height="70" rx="5" fill="#22C55E" opacity="0.03"/> - </g> - <text x="710" y="456" text-anchor="middle" fill="#22C55E" font-size="13" font-weight="700" letter-spacing="1">MARKETING</text> - <text x="710" y="474" text-anchor="middle" fill="#8B949E" font-size="8.5">changelogs, community</text> - <text x="710" y="488" text-anchor="middle" fill="#484F58" font-size="7.5">1 agent</text> - <!-- Connector Marketing → Center --> - <line x1="640" y1="445" x2="560" y2="395" 
stroke="#22C55E" stroke-width="1" opacity="0.4" stroke-dasharray="3 3"/> - - <!-- ==================== OUTPUT ARROWS ==================== --> - - <!-- Arrow to Git Repository --> - <line x1="200" y1="540" x2="200" y2="590" stroke="#22C55E" stroke-width="1.5" opacity="0.5" marker-end="url(#arrow)" filter="url(#glow-soft)"/> - <rect x="130" y="598" width="140" height="40" rx="4" fill="#0D1117" stroke="#30363D" stroke-width="1"/> - <text x="200" y="615" text-anchor="middle" fill="#8B949E" font-size="9" font-weight="600" letter-spacing="1">GIT REPOSITORY</text> - <text x="200" y="629" text-anchor="middle" fill="#484F58" font-size="7">branches, commits, PRs</text> - <!-- Connector from Quality down --> - <line x1="245" y1="500" x2="200" y2="540" stroke="#22C55E" stroke-width="1" opacity="0.3" stroke-dasharray="3 3"/> - - <!-- Arrow to CI/CD Pipeline --> - <line x1="480" y1="540" x2="480" y2="590" stroke="#22C55E" stroke-width="1.5" opacity="0.5" marker-end="url(#arrow)" filter="url(#glow-soft)"/> - <rect x="400" y="598" width="160" height="40" rx="4" fill="#0D1117" stroke="#30363D" stroke-width="1"/> - <text x="480" y="615" text-anchor="middle" fill="#8B949E" font-size="9" font-weight="600" letter-spacing="1">CI/CD PIPELINE</text> - <text x="480" y="629" text-anchor="middle" fill="#484F58" font-size="7">build, test, deploy</text> - <!-- Connector from Center down --> - <line x1="480" y1="395" x2="480" y2="540" stroke="#22C55E" stroke-width="1" opacity="0.3" stroke-dasharray="3 3"/> - - <!-- Arrow to Production --> - <line x1="755" y1="540" x2="755" y2="590" stroke="#22C55E" stroke-width="1.5" opacity="0.5" marker-end="url(#arrow)" filter="url(#glow-soft)"/> - <rect x="665" y="598" width="180" height="40" rx="4" fill="#0D1117" stroke="#30363D" stroke-width="1"/> - <text x="755" y="615" text-anchor="middle" fill="#8B949E" font-size="9" font-weight="600" letter-spacing="1">PRODUCTION INFRA</text> - <text x="755" y="629" text-anchor="middle" fill="#484F58" 
font-size="7">servers, containers, edge</text> - <!-- Connector from Ops down --> - <line x1="755" y1="500" x2="755" y2="540" stroke="#22C55E" stroke-width="1" opacity="0.3" stroke-dasharray="3 3"/> - - <!-- ==================== DECORATIVE ELEMENTS ==================== --> - - <!-- Corner accents --> - <polyline points="20,20 20,50" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="20,20 50,20" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="940,20 940,50" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="940,20 910,20" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="20,700 20,670" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="20,700 50,700" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="940,700 940,670" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="940,700 910,700" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - - <!-- Agent count summary --> - <text x="480" y="668" text-anchor="middle" fill="#484F58" font-size="8" letter-spacing="2">26 COORDINATED AGENTS | 5 TIERS | GOVERNED ACTION RUNTIME</text> - - <!-- Bottom branding --> - <text x="480" y="692" text-anchor="middle" fill="#30363D" font-size="7" letter-spacing="3">AGENTGUARD</text> -</svg> diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml index b8a4c56c..2e2a7a22 100644 --- a/pnpm-lock.yaml +++ b/pnpm-lock.yaml @@ -91,9 +91,6 @@ importers: '@red-codes/storage': specifier: workspace:* version: link:../../packages/storage - '@red-codes/swarm': - specifier: workspace:* - version: link:../../packages/swarm '@red-codes/telemetry': specifier: workspace:* version: link:../../packages/telemetry @@ -309,19 +306,6 @@ importers: specifier: ^7.6.0 version: 7.6.13 - packages/swarm: - dependencies: - yaml: - specifier: ^2.8.3 - version: 2.8.3 - devDependencies: - typescript: - specifier: ^5.8.3 - version: 5.9.3 - vitest: - specifier: ^4.1.0 - version: 
4.1.1(@types/node@25.5.0)(vite@8.0.2(@emnapi/core@1.9.1)(@emnapi/runtime@1.9.1)(@types/node@25.5.0)(esbuild@0.27.4)(tsx@4.21.0)(yaml@2.8.3)) - packages/telemetry: dependencies: '@red-codes/core': diff --git a/site/assets/sdlc-swarm-control-plane.svg b/site/assets/sdlc-swarm-control-plane.svg deleted file mode 100644 index 97423f9c..00000000 --- a/site/assets/sdlc-swarm-control-plane.svg +++ /dev/null @@ -1,201 +0,0 @@ -<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 960 720" font-family="'Inter', 'Segoe UI', system-ui, -apple-system, sans-serif"> - <defs> - <!-- Neon glow filter --> - <filter id="glow" x="-20%" y="-20%" width="140%" height="140%"> - <feGaussianBlur stdDeviation="4" result="blur"/> - <feMerge> - <feMergeNode in="blur"/> - <feMergeNode in="SourceGraphic"/> - </feMerge> - </filter> - <!-- Subtle glow for center --> - <filter id="glow-strong" x="-30%" y="-30%" width="160%" height="160%"> - <feGaussianBlur stdDeviation="8" result="blur"/> - <feMerge> - <feMergeNode in="blur"/> - <feMergeNode in="SourceGraphic"/> - </feMerge> - </filter> - <!-- Soft glow for outer ring --> - <filter id="glow-soft" x="-10%" y="-10%" width="120%" height="120%"> - <feGaussianBlur stdDeviation="2" result="blur"/> - <feMerge> - <feMergeNode in="blur"/> - <feMergeNode in="SourceGraphic"/> - </feMerge> - </filter> - <!-- Arrow marker --> - <marker id="arrow" markerWidth="10" markerHeight="7" refX="10" refY="3.5" orient="auto" markerUnits="strokeWidth"> - <polygon points="0 0, 10 3.5, 0 7" fill="#22C55E" opacity="0.8"/> - </marker> - <!-- Dim arrow marker --> - <marker id="arrow-dim" markerWidth="10" markerHeight="7" refX="10" refY="3.5" orient="auto" markerUnits="strokeWidth"> - <polygon points="0 0, 10 3.5, 0 7" fill="#22C55E" opacity="0.4"/> - </marker> - <!-- State arrow --> - <marker id="state-arrow" markerWidth="8" markerHeight="6" refX="8" refY="3" orient="auto" markerUnits="strokeWidth"> - <polygon points="0 0, 8 3, 0 6" fill="#F59E0B" opacity="0.7"/> - </marker> - 
</defs> - - <!-- Background --> - <rect width="960" height="720" fill="#0D1117" rx="0"/> - - <!-- Subtle grid pattern --> - <g opacity="0.03" stroke="#22C55E"> - <line x1="0" y1="0" x2="0" y2="720" stroke-width="0.5"/><line x1="40" y1="0" x2="40" y2="720" stroke-width="0.5"/><line x1="80" y1="0" x2="80" y2="720" stroke-width="0.5"/><line x1="120" y1="0" x2="120" y2="720" stroke-width="0.5"/><line x1="160" y1="0" x2="160" y2="720" stroke-width="0.5"/><line x1="200" y1="0" x2="200" y2="720" stroke-width="0.5"/><line x1="240" y1="0" x2="240" y2="720" stroke-width="0.5"/><line x1="280" y1="0" x2="280" y2="720" stroke-width="0.5"/><line x1="320" y1="0" x2="320" y2="720" stroke-width="0.5"/><line x1="360" y1="0" x2="360" y2="720" stroke-width="0.5"/><line x1="400" y1="0" x2="400" y2="720" stroke-width="0.5"/><line x1="440" y1="0" x2="440" y2="720" stroke-width="0.5"/><line x1="480" y1="0" x2="480" y2="720" stroke-width="0.5"/><line x1="520" y1="0" x2="520" y2="720" stroke-width="0.5"/><line x1="560" y1="0" x2="560" y2="720" stroke-width="0.5"/><line x1="600" y1="0" x2="600" y2="720" stroke-width="0.5"/><line x1="640" y1="0" x2="640" y2="720" stroke-width="0.5"/><line x1="680" y1="0" x2="680" y2="720" stroke-width="0.5"/><line x1="720" y1="0" x2="720" y2="720" stroke-width="0.5"/><line x1="760" y1="0" x2="760" y2="720" stroke-width="0.5"/><line x1="800" y1="0" x2="800" y2="720" stroke-width="0.5"/><line x1="840" y1="0" x2="840" y2="720" stroke-width="0.5"/><line x1="880" y1="0" x2="880" y2="720" stroke-width="0.5"/><line x1="920" y1="0" x2="920" y2="720" stroke-width="0.5"/><line x1="960" y1="0" x2="960" y2="720" stroke-width="0.5"/> - <line x1="0" y1="0" x2="960" y2="0" stroke-width="0.5"/><line x1="0" y1="40" x2="960" y2="40" stroke-width="0.5"/><line x1="0" y1="80" x2="960" y2="80" stroke-width="0.5"/><line x1="0" y1="120" x2="960" y2="120" stroke-width="0.5"/><line x1="0" y1="160" x2="960" y2="160" stroke-width="0.5"/><line x1="0" y1="200" x2="960" y2="200" 
stroke-width="0.5"/><line x1="0" y1="240" x2="960" y2="240" stroke-width="0.5"/><line x1="0" y1="280" x2="960" y2="280" stroke-width="0.5"/><line x1="0" y1="320" x2="960" y2="320" stroke-width="0.5"/><line x1="0" y1="360" x2="960" y2="360" stroke-width="0.5"/><line x1="0" y1="400" x2="960" y2="400" stroke-width="0.5"/><line x1="0" y1="440" x2="960" y2="440" stroke-width="0.5"/><line x1="0" y1="480" x2="960" y2="480" stroke-width="0.5"/><line x1="0" y1="520" x2="960" y2="520" stroke-width="0.5"/><line x1="0" y1="560" x2="960" y2="560" stroke-width="0.5"/><line x1="0" y1="600" x2="960" y2="600" stroke-width="0.5"/><line x1="0" y1="640" x2="960" y2="640" stroke-width="0.5"/><line x1="0" y1="680" x2="960" y2="680" stroke-width="0.5"/><line x1="0" y1="720" x2="960" y2="720" stroke-width="0.5"/> - </g> - - <!-- Title --> - <text x="480" y="44" text-anchor="middle" fill="#E6EDF3" font-size="20" font-weight="700" letter-spacing="3">AUTONOMOUS SDLC SWARM</text> - <text x="480" y="66" text-anchor="middle" fill="#22C55E" font-size="12" font-weight="500" letter-spacing="5" opacity="0.8">CONTROL PLANE</text> - <line x1="320" y1="76" x2="640" y2="76" stroke="#22C55E" stroke-width="0.5" opacity="0.3"/> - - <!-- ==================== OUTER RING: Recovery Controller ==================== --> - <!-- Outer ring arc (partial circle) --> - <ellipse cx="480" cy="370" rx="380" ry="270" fill="none" stroke="#F59E0B" stroke-width="1" opacity="0.15" stroke-dasharray="4 6"/> - - <!-- Recovery Controller label --> - <text x="480" y="108" text-anchor="middle" fill="#F59E0B" font-size="10" font-weight="600" letter-spacing="4" opacity="0.6">RECOVERY CONTROLLER</text> - - <!-- Escalation states along top arc --> - <!-- NORMAL --> - <rect x="168" y="118" width="90" height="26" rx="4" fill="#0D1117" stroke="#10B981" stroke-width="1" opacity="0.9"/> - <text x="213" y="135" text-anchor="middle" fill="#10B981" font-size="9" font-weight="600" letter-spacing="1">NORMAL</text> - - <!-- Arrow NORMAL → 
ELEVATED --> - <line x1="261" y1="131" x2="315" y2="131" stroke="#F59E0B" stroke-width="1" opacity="0.5" marker-end="url(#state-arrow)"/> - - <!-- ELEVATED --> - <rect x="328" y="118" width="90" height="26" rx="4" fill="#0D1117" stroke="#F59E0B" stroke-width="1" opacity="0.9"/> - <text x="373" y="135" text-anchor="middle" fill="#F59E0B" font-size="9" font-weight="600" letter-spacing="1">ELEVATED</text> - - <!-- Arrow ELEVATED → HIGH --> - <line x1="421" y1="131" x2="495" y2="131" stroke="#F59E0B" stroke-width="1" opacity="0.5" marker-end="url(#state-arrow)"/> - - <!-- HIGH --> - <rect x="508" y="118" width="90" height="26" rx="4" fill="#0D1117" stroke="#F97316" stroke-width="1" opacity="0.9"/> - <text x="553" y="135" text-anchor="middle" fill="#F97316" font-size="9" font-weight="700" letter-spacing="1">HIGH</text> - - <!-- Arrow HIGH → LOCKDOWN --> - <line x1="601" y1="131" x2="655" y2="131" stroke="#F59E0B" stroke-width="1" opacity="0.5" marker-end="url(#state-arrow)"/> - - <!-- LOCKDOWN --> - <rect x="668" y="118" width="100" height="26" rx="4" fill="#0D1117" stroke="#EF4444" stroke-width="1.5" opacity="0.9"/> - <text x="718" y="135" text-anchor="middle" fill="#EF4444" font-size="9" font-weight="700" letter-spacing="1">LOCKDOWN</text> - - <!-- ==================== CENTER: Control Plane ==================== --> - <!-- Outer glow ring --> - <rect x="370" y="300" width="220" height="100" rx="8" fill="none" stroke="#22C55E" stroke-width="0.5" opacity="0.15" filter="url(#glow-strong)"/> - - <!-- Main box --> - <rect x="375" y="305" width="210" height="90" rx="6" fill="#0D1117" stroke="#22C55E" stroke-width="1.5" filter="url(#glow)"/> - <rect x="375" y="305" width="210" height="90" rx="6" fill="#22C55E" opacity="0.04"/> - - <!-- Inner accent line --> - <line x1="395" y1="340" x2="565" y2="340" stroke="#22C55E" stroke-width="0.5" opacity="0.3"/> - - <text x="480" y="332" text-anchor="middle" fill="#E6EDF3" font-size="15" font-weight="700" letter-spacing="2">CONTROL 
PLANE</text> - <text x="480" y="358" text-anchor="middle" fill="#8B949E" font-size="9" letter-spacing="1">GOVERNED ACTION KERNEL</text> - <text x="480" y="376" text-anchor="middle" fill="#484F58" font-size="8" letter-spacing="1">propose | evaluate | execute | emit</text> - - <!-- ==================== MODULE: Core (top) ==================== --> - <g filter="url(#glow-soft)"> - <rect x="405" y="168" width="150" height="70" rx="5" fill="#0D1117" stroke="#22C55E" stroke-width="1" opacity="0.9"/> - <rect x="405" y="168" width="150" height="70" rx="5" fill="#22C55E" opacity="0.03"/> - </g> - <text x="480" y="194" text-anchor="middle" fill="#22C55E" font-size="13" font-weight="700" letter-spacing="1">CORE</text> - <text x="480" y="212" text-anchor="middle" fill="#8B949E" font-size="8.5">coding, PRs, CI triage</text> - <text x="480" y="226" text-anchor="middle" fill="#484F58" font-size="7.5">7 agents</text> - <!-- Connector Core → Center --> - <line x1="480" y1="238" x2="480" y2="305" stroke="#22C55E" stroke-width="1" opacity="0.4" stroke-dasharray="3 3"/> - - <!-- ==================== MODULE: Governance (left) ==================== --> - <g filter="url(#glow-soft)"> - <rect x="120" y="290" width="170" height="70" rx="5" fill="#0D1117" stroke="#22C55E" stroke-width="1" opacity="0.9"/> - <rect x="120" y="290" width="170" height="70" rx="5" fill="#22C55E" opacity="0.03"/> - </g> - <text x="205" y="316" text-anchor="middle" fill="#22C55E" font-size="13" font-weight="700" letter-spacing="1">GOVERNANCE</text> - <text x="205" y="334" text-anchor="middle" fill="#8B949E" font-size="8.5">policy, invariants, audit</text> - <text x="205" y="348" text-anchor="middle" fill="#484F58" font-size="7.5">3 agents</text> - <!-- Connector Governance → Center --> - <line x1="290" y1="330" x2="375" y2="345" stroke="#22C55E" stroke-width="1" opacity="0.4" stroke-dasharray="3 3"/> - - <!-- ==================== MODULE: Ops (right) ==================== --> - <g filter="url(#glow-soft)"> - <rect 
x="670" y="290" width="170" height="70" rx="5" fill="#0D1117" stroke="#22C55E" stroke-width="1" opacity="0.9"/> - <rect x="670" y="290" width="170" height="70" rx="5" fill="#22C55E" opacity="0.03"/> - </g> - <text x="755" y="316" text-anchor="middle" fill="#22C55E" font-size="13" font-weight="700" letter-spacing="1">OPS</text> - <text x="755" y="334" text-anchor="middle" fill="#8B949E" font-size="8.5">planning, backlog, observability</text> - <text x="755" y="348" text-anchor="middle" fill="#484F58" font-size="7.5">8 agents</text> - <!-- Connector Ops → Center --> - <line x1="670" y1="330" x2="585" y2="345" stroke="#22C55E" stroke-width="1" opacity="0.4" stroke-dasharray="3 3"/> - - <!-- ==================== MODULE: Quality (bottom-left) ==================== --> - <g filter="url(#glow-soft)"> - <rect x="160" y="430" width="170" height="70" rx="5" fill="#0D1117" stroke="#22C55E" stroke-width="1" opacity="0.9"/> - <rect x="160" y="430" width="170" height="70" rx="5" fill="#22C55E" opacity="0.03"/> - </g> - <text x="245" y="456" text-anchor="middle" fill="#22C55E" font-size="13" font-weight="700" letter-spacing="1">QUALITY</text> - <text x="245" y="474" text-anchor="middle" fill="#8B949E" font-size="8.5">testing, security, architecture</text> - <text x="245" y="488" text-anchor="middle" fill="#484F58" font-size="7.5">7 agents</text> - <!-- Connector Quality → Center --> - <line x1="310" y1="445" x2="400" y2="395" stroke="#22C55E" stroke-width="1" opacity="0.4" stroke-dasharray="3 3"/> - - <!-- ==================== MODULE: Marketing (bottom-right) ==================== --> - <g filter="url(#glow-soft)"> - <rect x="620" y="430" width="180" height="70" rx="5" fill="#0D1117" stroke="#22C55E" stroke-width="1" opacity="0.9"/> - <rect x="620" y="430" width="180" height="70" rx="5" fill="#22C55E" opacity="0.03"/> - </g> - <text x="710" y="456" text-anchor="middle" fill="#22C55E" font-size="13" font-weight="700" letter-spacing="1">MARKETING</text> - <text x="710" y="474" 
text-anchor="middle" fill="#8B949E" font-size="8.5">changelogs, community</text> - <text x="710" y="488" text-anchor="middle" fill="#484F58" font-size="7.5">1 agent</text> - <!-- Connector Marketing → Center --> - <line x1="640" y1="445" x2="560" y2="395" stroke="#22C55E" stroke-width="1" opacity="0.4" stroke-dasharray="3 3"/> - - <!-- ==================== OUTPUT ARROWS ==================== --> - - <!-- Arrow to Git Repository --> - <line x1="200" y1="540" x2="200" y2="590" stroke="#22C55E" stroke-width="1.5" opacity="0.5" marker-end="url(#arrow)" filter="url(#glow-soft)"/> - <rect x="130" y="598" width="140" height="40" rx="4" fill="#0D1117" stroke="#30363D" stroke-width="1"/> - <text x="200" y="615" text-anchor="middle" fill="#8B949E" font-size="9" font-weight="600" letter-spacing="1">GIT REPOSITORY</text> - <text x="200" y="629" text-anchor="middle" fill="#484F58" font-size="7">branches, commits, PRs</text> - <!-- Connector from Quality down --> - <line x1="245" y1="500" x2="200" y2="540" stroke="#22C55E" stroke-width="1" opacity="0.3" stroke-dasharray="3 3"/> - - <!-- Arrow to CI/CD Pipeline --> - <line x1="480" y1="540" x2="480" y2="590" stroke="#22C55E" stroke-width="1.5" opacity="0.5" marker-end="url(#arrow)" filter="url(#glow-soft)"/> - <rect x="400" y="598" width="160" height="40" rx="4" fill="#0D1117" stroke="#30363D" stroke-width="1"/> - <text x="480" y="615" text-anchor="middle" fill="#8B949E" font-size="9" font-weight="600" letter-spacing="1">CI/CD PIPELINE</text> - <text x="480" y="629" text-anchor="middle" fill="#484F58" font-size="7">build, test, deploy</text> - <!-- Connector from Center down --> - <line x1="480" y1="395" x2="480" y2="540" stroke="#22C55E" stroke-width="1" opacity="0.3" stroke-dasharray="3 3"/> - - <!-- Arrow to Production --> - <line x1="755" y1="540" x2="755" y2="590" stroke="#22C55E" stroke-width="1.5" opacity="0.5" marker-end="url(#arrow)" filter="url(#glow-soft)"/> - <rect x="665" y="598" width="180" height="40" rx="4" 
fill="#0D1117" stroke="#30363D" stroke-width="1"/> - <text x="755" y="615" text-anchor="middle" fill="#8B949E" font-size="9" font-weight="600" letter-spacing="1">PRODUCTION INFRA</text> - <text x="755" y="629" text-anchor="middle" fill="#484F58" font-size="7">servers, containers, edge</text> - <!-- Connector from Ops down --> - <line x1="755" y1="500" x2="755" y2="540" stroke="#22C55E" stroke-width="1" opacity="0.3" stroke-dasharray="3 3"/> - - <!-- ==================== DECORATIVE ELEMENTS ==================== --> - - <!-- Corner accents --> - <polyline points="20,20 20,50" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="20,20 50,20" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="940,20 940,50" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="940,20 910,20" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="20,700 20,670" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="20,700 50,700" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="940,700 940,670" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - <polyline points="940,700 910,700" stroke="#22C55E" stroke-width="1" opacity="0.2"/> - - <!-- Agent count summary --> - <text x="480" y="668" text-anchor="middle" fill="#484F58" font-size="8" letter-spacing="2">26 COORDINATED AGENTS | 5 TIERS | GOVERNED ACTION RUNTIME</text> - - <!-- Bottom branding --> - <text x="480" y="692" text-anchor="middle" fill="#30363D" font-size="7" letter-spacing="3">AGENTGUARD</text> -</svg> diff --git a/site/index.html b/site/index.html index 86df2ec0..77aadbac 100644 --- a/site/index.html +++ b/site/index.html @@ -368,7 +368,6 @@ <a href="#architecture" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Architecture</a> <a href="#invariants" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Invariants</a> <a href="#cli" class="text-muted 
hover:text-text transition-colors text-sm font-medium cursor-pointer">CLI</a> - <a href="#swarm" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Swarm</a> <a href="https://agentguard-cloud-dashboard.vercel.app" target="_blank" rel="noopener noreferrer" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Dashboard</a> <a href="shellforge.html" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">ShellForge</a> <a href="posts.html" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Newsletter</a> @@ -395,7 +394,6 @@ <a href="#architecture" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Architecture</a> <a href="#invariants" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Invariants</a> <a href="#cli" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">CLI</a> - <a href="#swarm" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Swarm</a> <a href="https://agentguard-cloud-dashboard.vercel.app" target="_blank" rel="noopener noreferrer" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Dashboard</a> <a href="shellforge.html" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">ShellForge</a> <a href="posts.html" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Newsletter</a> @@ -876,15 +874,6 @@ <h3 class="font-mono font-semibold text-text mb-2">Agents That Learn</h3> <p class="text-muted text-sm leading-relaxed">Educate mode captures governance lessons into persistent memory. 
Agents improve over time—junior agents level up through the governance training ladder without hard blocks slowing delivery.</p> </div> - <!-- Card 12: Squad Swarm --> - <div class="reveal bg-surface border border-surface-light rounded-xl p-6 card-lift cursor-pointer"> - <div class="w-10 h-10 rounded-lg bg-cta/10 flex items-center justify-center mb-4"> - <svg class="w-5 h-5 text-cta" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" aria-hidden="true"><path d="M17 21v-2a4 4 0 00-4-4H5a4 4 0 00-4 4v2"/><circle cx="9" cy="7" r="4"/><path d="M23 21v-2a4 4 0 00-3-3.87"/><path d="M16 3.13a4 4 0 010 7.75"/></svg> - </div> - <h3 class="font-mono font-semibold text-text mb-2">Squad-Based Swarm</h3> - <p class="text-muted text-sm leading-relaxed">Organize agents into governed squads with EM→Director reporting, loop guards, lease-based coordination, and budget-aware task dispatch. Scale from 3 to 1000 agents with discipline.</p> - </div> - <!-- Card 10: Pre-Push Hooks --> <div class="reveal bg-surface border border-surface-light rounded-xl p-6 card-lift cursor-pointer"> <div class="w-10 h-10 rounded-lg bg-warning/10 flex items-center justify-center mb-4"> @@ -1678,164 +1667,6 @@ <h3 class="font-mono font-semibold text-muted text-xs uppercase tracking-wider m </div> </section> - <!-- ============================================ --> - <!-- AGENT SWARM --> - <!-- ============================================ --> - <section id="swarm" class="py-24 bg-surface/30" aria-labelledby="swarm-heading"> - <div class="max-w-7xl mx-auto px-6"> - <div class="text-center mb-6 reveal"> - <h2 id="swarm-heading" class="font-mono font-bold text-3xl sm:text-4xl mb-4">Agent Swarm</h2> - <p class="text-muted text-lg max-w-2xl mx-auto">Deploy a coordinated team of governed AI agents. 
139 agents across 5 governed drivers, each with dedicated skills, schedules, and governance policies.</p> - </div> - - <!-- Stats --> - <div class="flex justify-center gap-8 mb-14 reveal"> - <div class="text-center"> - <span class="block font-mono font-bold text-2xl text-cta">139</span> - <span class="text-muted text-xs uppercase tracking-wider">Agents</span> - </div> - <div class="text-center"> - <span class="block font-mono font-bold text-2xl text-cta">5</span> - <span class="text-muted text-xs uppercase tracking-wider">Drivers</span> - </div> - <div class="text-center"> - <span class="block font-mono font-bold text-2xl text-cta">60+</span> - <span class="text-muted text-xs uppercase tracking-wider">Skills</span> - </div> - <div class="text-center"> - <span class="block font-mono font-bold text-2xl text-cta">24</span> - <span class="text-muted text-xs uppercase tracking-wider">Prompts</span> - </div> - </div> - - <!-- Architecture Diagram --> - <div class="mb-14 reveal"> - <div class="max-w-4xl mx-auto bg-surface border border-surface-light rounded-xl p-4 sm:p-6"> - <img src="assets/sdlc-swarm-control-plane.svg" alt="Autonomous SDLC Swarm — Control Plane architecture diagram showing governed AI agents across multiple drivers connected to a central control plane with recovery controller escalation states" class="w-full h-auto rounded-lg" loading="lazy"/> - </div> - </div> - - <!-- Tier cards --> - <div class="grid sm:grid-cols-2 lg:grid-cols-3 gap-4 mb-12 reveal-stagger"> - <!-- Core --> - <div class="reveal bg-surface border border-surface-light rounded-xl p-5"> - <div class="flex items-center gap-2 mb-3"> - <span class="w-2 h-2 rounded-full bg-cta"></span> - <h3 class="font-mono font-semibold text-cta text-xs uppercase tracking-wider">Core</h3> - <span class="ml-auto text-muted text-xs font-mono">7 agents</span> - </div> - <ul class="space-y-2 text-sm font-mono"> - <li class="text-muted"><span class="text-text">Implementation</span> — Picks issues, codes, opens 
PRs</li> - <li class="text-muted"><span class="text-text">Code Review</span> — Reviews open PRs</li> - <li class="text-muted"><span class="text-text">PR Merger</span> — Auto-merges approved PRs</li> - <li class="text-muted"><span class="text-text">CI Triage</span> — Diagnoses failing CI</li> - <li class="text-muted"><span class="text-text">Conflict Resolver</span> — Fixes merge conflicts</li> - <li class="text-muted"><span class="text-text">PR Responder</span> — Addresses review comments</li> - <li class="text-muted"><span class="text-text">Branch Janitor</span> — Cleans stale branches</li> - </ul> - </div> - - <!-- Governance --> - <div class="reveal bg-surface border border-surface-light rounded-xl p-5"> - <div class="flex items-center gap-2 mb-3"> - <span class="w-2 h-2 rounded-full bg-danger"></span> - <h3 class="font-mono font-semibold text-danger text-xs uppercase tracking-wider">Governance</h3> - <span class="ml-auto text-muted text-xs font-mono">3 agents</span> - </div> - <ul class="space-y-2 text-sm font-mono"> - <li class="text-muted"><span class="text-text">Risk Escalation</span> — Assesses cumulative risk</li> - <li class="text-muted"><span class="text-text">Recovery Controller</span> — Self-healing playbooks</li> - <li class="text-muted"><span class="text-text">Governance Monitor</span> — Audits governance logs</li> - </ul> - </div> - - <!-- Ops --> - <div class="reveal bg-surface border border-surface-light rounded-xl p-5"> - <div class="flex items-center gap-2 mb-3"> - <span class="w-2 h-2 rounded-full bg-warning"></span> - <h3 class="font-mono font-semibold text-warning text-xs uppercase tracking-wider">Ops</h3> - <span class="ml-auto text-muted text-xs font-mono">8 agents</span> - </div> - <ul class="space-y-2 text-sm font-mono"> - <li class="text-muted"><span class="text-text">Planning</span> — Sprint planning from backlog</li> - <li class="text-muted"><span class="text-text">Observability</span> — Monitors runtime health</li> - <li 
class="text-muted"><span class="text-text">Backlog Groomer</span> — Triages stale issues</li> - <li class="text-muted"><span class="text-text">Docs</span> — Keeps docs in sync</li> - <li class="text-muted"><span class="text-text">Product Health</span> — Dependency & coverage checks</li> - <li class="text-muted"><span class="text-text">Progress Controller</span> — Tracks sprint velocity</li> - <li class="text-muted"><span class="text-text">Repo Hygiene</span> — Labels, templates, configs</li> - <li class="text-muted"><span class="text-text">Retrospective</span> — Weekly retrospectives</li> - </ul> - </div> - - <!-- Quality --> - <div class="reveal bg-surface border border-surface-light rounded-xl p-5"> - <div class="flex items-center gap-2 mb-3"> - <span class="w-2 h-2 rounded-full bg-info"></span> - <h3 class="font-mono font-semibold text-info text-xs uppercase tracking-wider">Quality</h3> - <span class="ml-auto text-muted text-xs font-mono">7 agents</span> - </div> - <ul class="space-y-2 text-sm font-mono"> - <li class="text-muted"><span class="text-text">Testing</span> — Runs & expands test suites</li> - <li class="text-muted"><span class="text-text">Test Generation</span> — Generates new tests</li> - <li class="text-muted"><span class="text-text">Security Audit</span> — Scans for vulnerabilities</li> - <li class="text-muted"><span class="text-text">Architecture Review</span> — Detects drift & debt</li> - <li class="text-muted"><span class="text-text">CI/CD Hardener</span> — Strengthens pipelines</li> - <li class="text-muted"><span class="text-text">PR Audit</span> — Audits merged PRs</li> - <li class="text-muted"><span class="text-text">Infra Health</span> — Infrastructure checks</li> - </ul> - </div> - - <!-- Marketing --> - <div class="reveal bg-surface border border-surface-light rounded-xl p-5"> - <div class="flex items-center gap-2 mb-3"> - <span class="w-2 h-2 rounded-full bg-surface-light"></span> - <h3 class="font-mono font-semibold text-muted text-xs 
uppercase tracking-wider">Marketing</h3> - <span class="ml-auto text-muted text-xs font-mono">1 agent</span> - </div> - <ul class="space-y-2 text-sm font-mono"> - <li class="text-muted"><span class="text-text">Content Agent</span> — Changelogs, social posts & community engagement from project activity</li> - </ul> - </div> - - <!-- Quick Start card --> - <div class="reveal bg-surface border border-cta/20 rounded-xl p-5"> - <div class="flex items-center gap-2 mb-3"> - <span class="w-2 h-2 rounded-full bg-cta"></span> - <h3 class="font-mono font-semibold text-cta text-xs uppercase tracking-wider">Quick Start</h3> - </div> - <div class="space-y-3"> - <div> - <p class="text-muted text-xs mb-1">Scaffold all tiers:</p> - <pre class="bg-bg rounded-lg px-3 py-2 text-sm font-mono text-text overflow-x-auto"><code>agentguard init swarm</code></pre> - </div> - <div> - <p class="text-muted text-xs mb-1">Select specific tiers:</p> - <pre class="bg-bg rounded-lg px-3 py-2 text-sm font-mono text-text overflow-x-auto"><code>agentguard init swarm \ - --tiers core,governance</code></pre> - </div> - <p class="text-muted text-xs mt-3">Install with <span class="text-text font-mono">npm install -g @red-codes/agentguard</span>. 
Generates <span class="text-text font-mono">.claude/skills/</span>, agent prompts, and <span class="text-text font-mono">agentguard-swarm.yaml</span> config.</p> - </div> - </div> - - <!-- Post-Setup card --> - <div class="reveal bg-surface border border-border rounded-xl p-5 mt-4"> - <div class="flex items-center gap-2 mb-3"> - <span class="w-2 h-2 rounded-full bg-yellow-400"></span> - <h3 class="font-mono font-semibold text-yellow-400 text-xs uppercase tracking-wider">Post-Setup — Required</h3> - </div> - <p class="text-muted text-sm mb-3">Scaffolding generates the files, but agents still need to be configured in <strong class="text-text">Claude Desktop</strong> before they can run:</p> - <ol class="text-muted text-sm space-y-2 list-decimal list-inside"> - <li><strong class="text-text">Create scheduled tasks</strong> — Register each agent as a scheduled task in Claude Desktop using the cron schedules from the agent manifest.</li> - <li><strong class="text-text">Set <span class="font-mono text-cta">worktree: true</span></strong> — Enable worktree isolation so parallel agents get their own git worktree and don't conflict on file writes or git state.</li> - <li><strong class="text-text">Bypass permissions manually</strong> — Agents run unattended with no human to approve prompts. 
Pre-approve the required tool permissions (file, shell, git) in Claude Desktop for each agent.</li> - </ol> - <p class="text-muted text-xs mt-3">AgentGuard's governance policy + invariant system enforces safety boundaries even with permissions bypassed.</p> - </div> - </div> - </div> - </section> - <!-- ============================================ --> <!-- INTEGRATIONS --> <!-- ============================================ --> @@ -2194,7 +2025,6 @@ <h3 class="font-mono font-semibold text-text text-sm mb-4">Project</h3> <li><a href="#architecture" class="text-muted hover:text-text transition-colors text-sm cursor-pointer">Architecture</a></li> <li><a href="#invariants" class="text-muted hover:text-text transition-colors text-sm cursor-pointer">Invariants</a></li> <li><a href="#cli" class="text-muted hover:text-text transition-colors text-sm cursor-pointer">CLI Commands</a></li> - <li><a href="#swarm" class="text-muted hover:text-text transition-colors text-sm cursor-pointer">Agent Swarm</a></li> <li><a href="shellforge.html" class="text-muted hover:text-text transition-colors text-sm cursor-pointer">ShellForge</a></li> <li><a href="https://www.npmjs.com/package/@red-codes/agentguard" target="_blank" rel="noopener noreferrer" class="text-muted hover:text-text transition-colors text-sm cursor-pointer">@red-codes/agentguard</a></li> <li><a href="https://www.npmjs.com/package/@red-codes/core" target="_blank" rel="noopener noreferrer" class="text-muted hover:text-text transition-colors text-sm cursor-pointer">@red-codes/core</a></li> diff --git a/site/posts.html b/site/posts.html index 6ff27c45..4c322df4 100644 --- a/site/posts.html +++ b/site/posts.html @@ -153,7 +153,6 @@ <a href="index.html#architecture" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Architecture</a> <a href="index.html#invariants" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Invariants</a> <a href="index.html#cli" 
class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">CLI</a> - <a href="index.html#swarm" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Swarm</a> <a href="posts.html" class="text-cta text-sm font-medium cursor-pointer">Newsletter</a> <a href="https://github.com/AgentGuardHQ/agentguard" target="_blank" rel="noopener noreferrer" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">GitHub</a> <button id="theme-toggle-desktop" class="theme-toggle text-muted hover:text-text" aria-label="Toggle light/dark mode"> @@ -176,7 +175,6 @@ <a href="index.html#architecture" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Architecture</a> <a href="index.html#invariants" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Invariants</a> <a href="index.html#cli" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">CLI</a> - <a href="index.html#swarm" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">Swarm</a> <a href="posts.html" class="text-cta text-sm font-medium cursor-pointer">Newsletter</a> <a href="https://github.com/AgentGuardHQ/agentguard" target="_blank" rel="noopener noreferrer" class="text-muted hover:text-text transition-colors text-sm font-medium cursor-pointer">GitHub</a> <a href="index.html#quickstart" class="bg-cta hover:bg-cta-dark text-bg font-semibold text-sm px-4 py-2 rounded-lg transition-colors text-center cursor-pointer">Get Started</a> diff --git a/site/shellforge.html b/site/shellforge.html index 25e40ad3..2d45a2ff 100644 --- a/site/shellforge.html +++ b/site/shellforge.html @@ -127,7 +127,6 @@ <a href="#quickstart" class="text-muted hover:text-text transition-colors text-sm font-medium">Quick Start</a> <a href="#stack" class="text-muted hover:text-text transition-colors text-sm font-medium">Stack</a> 
<a href="#governance" class="text-muted hover:text-text transition-colors text-sm font-medium">Governance</a> - <a href="#swarm" class="text-muted hover:text-text transition-colors text-sm font-medium">Swarm</a> <a href="index.html" class="text-muted hover:text-text transition-colors text-sm font-medium">AgentGuard</a> <a href="https://github.com/AgentGuardHQ/shellforge" target="_blank" rel="noopener noreferrer" class="text-muted hover:text-text transition-colors text-sm font-medium">GitHub</a> <button id="theme-toggle" class="theme-toggle text-muted hover:text-text" aria-label="Toggle light/dark mode"> @@ -505,93 +504,6 @@ <h2 id="governance-heading" class="font-mono font-bold text-3xl sm:text-4xl mb-6 </div> </section> - <!-- ============================================ --> - <!-- SWARM MODE --> - <!-- ============================================ --> - <section id="swarm" class="py-24 bg-surface/30" aria-labelledby="swarm-heading"> - <div class="max-w-7xl mx-auto px-6"> - <div class="text-center mb-16 reveal"> - <span class="inline-block font-mono text-bg text-xs font-bold tracking-wider uppercase bg-orange px-3 py-1 rounded-full mb-4">Swarm Mode</span> - <h2 id="swarm-heading" class="font-mono font-bold text-3xl sm:text-4xl mb-4">Run a 24/7 governed agent swarm on your Mac</h2> - <p class="text-muted text-lg max-w-2xl mx-auto">Memory-aware scheduling automatically computes how many agents can run in parallel based on your available RAM. 
Queue the rest.</p> - </div> - - <div class="grid lg:grid-cols-2 gap-12 items-start"> - <!-- Config example --> - <div class="reveal"> - <div class="bg-surface rounded-xl border border-surface-light overflow-hidden"> - <div class="flex items-center gap-2 px-4 py-3 bg-surface-light/50 border-b border-surface-light"> - <span class="font-mono text-xs text-muted">agents.yaml</span> - </div> - <pre class="p-5 font-mono text-xs leading-relaxed text-muted overflow-x-auto"><code><span class="text-cta">max_parallel</span>: 0 <span class="text-surface-light"># 0 = auto from RAM</span> -<span class="text-cta">model_ram_gb</span>: 19 <span class="text-surface-light"># qwen3:30b Q4</span> - -<span class="text-cta">agents</span>: - - <span class="text-cta">name</span>: qa-agent - <span class="text-cta">system</span>: <span class="text-text">"You are a QA engineer."</span> - <span class="text-cta">prompt</span>: <span class="text-text">"Find test gaps."</span> - <span class="text-cta">schedule</span>: <span class="text-text">"4h"</span> - <span class="text-cta">priority</span>: 2 - <span class="text-cta">timeout</span>: 300 - - - <span class="text-cta">name</span>: security-agent - <span class="text-cta">system</span>: <span class="text-text">"You are a security reviewer."</span> - <span class="text-cta">prompt</span>: <span class="text-text">"Audit for vulnerabilities."</span> - <span class="text-cta">schedule</span>: <span class="text-text">"24h"</span> - <span class="text-cta">priority</span>: 1 - <span class="text-cta">timeout</span>: 600</code></pre> - </div> - <div class="mt-4 font-mono text-sm"> - <div class="text-muted"># Start the swarm</div> - <div><span class="text-orange">$</span> <span class="text-text">shellforge serve agents.yaml</span></div> - </div> - </div> - - <!-- Memory table + multi-driver --> - <div class="reveal space-y-6"> - <div class="bg-surface border border-surface-light rounded-xl p-6"> - <h3 class="font-mono font-semibold text-text mb-4">Memory 
Budget — qwen3:30b Q4</h3> - <div class="overflow-x-auto"> - <table class="w-full text-sm font-mono"> - <thead> - <tr class="border-b border-surface-light"> - <th class="text-left text-muted font-medium pb-2">Mac</th> - <th class="text-left text-muted font-medium pb-2">RAM</th> - <th class="text-left text-muted font-medium pb-2">Max Parallel</th> - </tr> - </thead> - <tbody class="space-y-1"> - <tr class="border-b border-surface-light/50"> - <td class="py-2 text-text">M4 Pro 48GB</td> - <td class="py-2 text-muted">48 GB</td> - <td class="py-2 text-cta">3–4 agents</td> - </tr> - <tr class="border-b border-surface-light/50"> - <td class="py-2 text-text">M4 32GB</td> - <td class="py-2 text-muted">32 GB</td> - <td class="py-2 text-cta">1–2 agents</td> - </tr> - </tbody> - </table> - </div> - <p class="text-muted text-xs mt-3"><code class="text-cta">OLLAMA_KV_CACHE_TYPE=q8_0</code> halves KV cache memory — doubles agent capacity.</p> - </div> - - <div class="bg-surface border border-surface-light rounded-xl p-6"> - <h3 class="font-mono font-semibold text-text mb-3">Multi-Driver in One DAG</h3> - <p class="text-muted text-sm mb-4">Orchestrate multiple agent drivers in a single Dagu pipeline. 
Each driver runs with full governance.</p> - <div class="font-mono text-xs space-y-1"> - <div><span class="text-orange">$</span> <span class="text-text">shellforge run claude "review this code"</span></div> - <div><span class="text-orange">$</span> <span class="text-text">shellforge run codex "generate tests"</span></div> - <div><span class="text-orange">$</span> <span class="text-text">shellforge run gemini "security audit"</span></div> - <div class="text-muted"># or compose in dags/multi-driver-swarm.yaml</div> - </div> - </div> - </div> - </div> - </div> - </section> - <!-- ============================================ --> <!-- MODEL OPTIONS --> <!-- ============================================ --> @@ -684,7 +596,6 @@ <h3 class="font-mono font-semibold text-text text-sm mb-3">ShellForge</h3> <li><a href="#quickstart" class="text-muted hover:text-text transition-colors">Quick Start</a></li> <li><a href="#stack" class="text-muted hover:text-text transition-colors">The Stack</a></li> <li><a href="#governance" class="text-muted hover:text-text transition-colors">Governance</a></li> - <li><a href="#swarm" class="text-muted hover:text-text transition-colors">Swarm Mode</a></li> <li><a href="https://github.com/AgentGuardHQ/shellforge" target="_blank" rel="noopener noreferrer" class="text-muted hover:text-text transition-colors">GitHub</a></li> </ul> </div> diff --git a/tsconfig.json b/tsconfig.json index 1de9b8d1..b5ab1414 100644 --- a/tsconfig.json +++ b/tsconfig.json @@ -40,9 +40,6 @@ { "path": "packages/matchers" }, - { - "path": "packages/swarm" - }, { "path": "packages/sdk" },