feat(ambient): add agent orchestration to ambient mode #149

Merged
dean0x merged 3 commits into main from feat/ambient-orchestration
Mar 19, 2026
dean0x commented Mar 19, 2026

Summary

  • Adds ORCHESTRATED depth tier to ambient mode — spawns agent pipelines for complex multi-file tasks
  • Three-tier system: QUICK (zero overhead), GUIDED (skills + main session), ORCHESTRATED (skills + agent orchestration)
  • Three new orchestration skills: implementation-orchestration, debug-orchestration, plan-orchestration
  • Removes redundant /ambient slash command — ambient mode is hook-only via devflow ambient --enable
  • Trims hook preamble to ~30 words with git keyword fast-path for zero-overhead git ops

What Changed

New skills (3):

  • implementation-orchestration — Lightweight /implement pipeline: pre-flight → Coder → Validator → Simplifier → Scrutinizer → Shepherd
  • debug-orchestration — Lightweight /debug pipeline: hypotheses → parallel Explores → convergence → report → offer fix (GUIDED, not orchestrated)
  • plan-orchestration — Lightweight Plan pipeline: Skimmer → Explores → Plan agent → gap validation

Modified:

  • ambient-router SKILL.md — Three-tier depth classification, ORCHESTRATED skill selection, Step 5 agent orchestration
  • ambient-prompt hook — Trimmed preamble, git keyword fast-path
  • plugin.json — Added orchestration skills + agents to manifest
  • skill-catalog.md — Full rewrite with GUIDED/ORCHESTRATED depth columns
  • Test helpers — Extracted CLASSIFICATION_PATTERN constant, fixed extractDepth to use capture group 2
  • Integration tests — Fixed zero-assertion tests, added ORCHESTRATED classification test

Removed:

  • /ambient command (ambient.md) — hook handles everything

Behavioral Changes

  • EXPLORE intent always classifies as QUICK (was split QUICK/GUIDED)
  • Simple text edits ("Update the README") classify as QUICK (was BUILD/GUIDED)
  • Debug agent budget cap removed — agents scale to investigation needs
  • BUILD intent renamed to IMPLEMENT

Test Plan

  • npm run build — 35 skills, 17 plugins, all assets distributed
  • npm test — 250/250 tests pass
  • npm run test:integration — Integration tests (requires claude CLI + ambient enabled)
  • Manual: devflow ambient --status works
  • Manual: git operations get zero overhead (fast-path)
  • Manual: "add a login form" triggers IMPLEMENT/GUIDED with TDD
  • Manual: "refactor auth across API, DB, and frontend" triggers IMPLEMENT/ORCHESTRATED

Replace passive skill-only ambient mode with full agent orchestration.
Two depth tiers: QUICK (zero overhead) and ORCHESTRATED (skills + agent
pipelines). Three new orchestration skills drive intent-specific pipelines:

- implementation-orchestration: pre-flight → Coder → quality gates
- debug-orchestration: competing hypotheses → parallel Explores → root cause
- plan-orchestration: Skimmer → Explores → Plan agent → gap validation

Key changes:
- Remove ELEVATE tier, replace GUIDED with ORCHESTRATED
- Rename BUILD intent to IMPLEMENT for clarity
- Skills loaded via Skill tool instead of Read (fixes broken loading)
- Add TDD skill to Coder agent frontmatter permanently
- Ambient plugin now includes 7 agents + 4 skills
- Update ambient-prompt hook preamble for new tiers
- Classification conservatism: default to QUICK

Closes #84 (superseded by Skill tool approach).

dean0x commented Mar 19, 2026

Code Review Summary: Ambient Mode Agent Orchestration

PR: #149
Branch: feat/ambient-orchestration → main
Review Date: 2026-03-19
Reviewers: 9 comprehensive scans (Security, Architecture, Performance, Complexity, Consistency, Regression, Tests, TypeScript, Documentation)


Blocking Issues (Must Address Before Merge)

Critical: Stale Integration Tests

Severity: CRITICAL | Confidence: 100% (3 reviewers flagged)
Files: /tests/integration/helpers.ts, /tests/integration/ambient-activation.test.ts

The integration test helpers use regex patterns matching the old taxonomy (BUILD|GUIDED|ELEVATE) but the PR renames these to IMPLEMENT|ORCHESTRATED. Tests will silently pass without validating behavior.

Required Fix:

  • Update helpers.ts line 40 regex: /(IMPLEMENT|DEBUG|REVIEW|PLAN|EXPLORE|CHAT)\s*\/\s*(QUICK|ORCHESTRATED)/
  • Update helpers.ts line 55 regex: /(IMPLEMENT|DEBUG|REVIEW|PLAN|EXPLORE|CHAT)/
  • Update helpers.ts line 63 regex: /\s*(QUICK|ORCHESTRATED)/
  • Update ambient-activation.test.ts lines 36-54 to expect IMPLEMENT and ORCHESTRATED

High: Plugin README Not Updated

Severity: HIGH | Confidence: 100% (2 reviewers flagged)
File: /plugins/devflow-ambient/README.md

README still documents the removed 3-tier model (QUICK/GUIDED/ELEVATE), old intent names (BUILD), and claims "no agents spawned." Users will see incorrect information about how the plugin works.

Required Fix: Rewrite README to document:

  • Two-tier model: QUICK (zero overhead) | ORCHESTRATED (skills + agent pipelines)
  • Intent names: IMPLEMENT, DEBUG, PLAN, REVIEW, EXPLORE, CHAT
  • New orchestration skills: implementation-orchestration, debug-orchestration, plan-orchestration
  • Agent pipelines: Coder + quality gates, competing hypothesis investigation, design-focused exploration

High: Missing Agent/Skill Declarations

Severity: HIGH | Confidence: 100% (2 reviewers flagged)
Files: /plugins/devflow-ambient/.claude-plugin/plugin.json, /shared/skills/

Orchestration skills reference "Explore agents" and "Plan agents" in phases, but these are not declared in plugin.json agents array. Also missing: synthesizer agent (used by debug-orchestration Phase 5 for convergence). The ambient plugin also lacks a git agent despite implementation-orchestration Phase 1 performing branch safety checks.

Required Fix:

  • Audit plugin.json agents list against what orchestration skills actually spawn
  • Either add missing agents to manifest or document that Explore/Plan are ephemeral Task sub-agents
  • Consider adding synthesizer for convergence (or clarify that convergence is inline)
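
The audit described above can be sketched as a small pure function. This is only an illustration: the `PluginManifest` and `SkillAgentUsage` shapes, the function name, and the `ephemeral` allow-list are assumptions, not the actual plugin.json schema or anything in the devflow codebase.

```typescript
// Hypothetical shapes; the real plugin.json schema and skill metadata may differ.
interface PluginManifest {
  agents: string[];
}

interface SkillAgentUsage {
  skill: string;
  spawns: string[]; // agent names referenced by the skill's phases
}

// Returns, per skill, the agents it spawns that the manifest does not declare.
// Names in `ephemeral` (for example Task sub-agents) are treated as
// intentionally undeclared. Matching is case-insensitive.
function findUndeclaredAgents(
  manifest: PluginManifest,
  usages: SkillAgentUsage[],
  ephemeral: string[] = [],
): Record<string, string[]> {
  const declared = manifest.agents.map((a) => a.toLowerCase());
  const allowed = ephemeral.map((a) => a.toLowerCase());
  const missing: Record<string, string[]> = {};

  for (const usage of usages) {
    const gaps = usage.spawns.filter((name) => {
      const key = name.toLowerCase();
      return declared.indexOf(key) === -1 && allowed.indexOf(key) === -1;
    });
    if (gaps.length > 0) {
      missing[usage.skill] = gaps;
    }
  }
  return missing;
}
```

An empty result would mean every spawned agent is either declared in the manifest or explicitly marked as ephemeral, which covers both options in the required fix.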

High: search-first Skill Dropped

Severity: HIGH | Confidence: 100% (2 reviewers flagged)
Files: /shared/skills/ambient-router/SKILL.md:55, /shared/skills/ambient-router/references/skill-catalog.md

search-first was "Always for BUILD intent" in main branch but is completely removed from IMPLEMENT skill selection with no migration note. This skill enforces "research-before-building" and was added as a deliberate quality gate in v1.5.0 (#111).

Required Fix:

  • Either restore search-first to IMPLEMENT skill selection, OR
  • Document the intentional removal with rationale (e.g., "redundant when Coder has core-patterns skill")

High-Priority Should-Fix Issues

High: Security - Missing try/catch on JSON.parse

Severity: HIGH | Confidence: 95% (Security reviewer flagged)
File: /src/cli/commands/ambient.ts:33,70,95

Exported utility functions call JSON.parse() without try/catch. Corrupted settings.json would crash the CLI.

Recommended Fix: Wrap JSON.parse in try/catch with user-friendly error message
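
A minimal sketch of the recommended wrapper, assuming a result-type style of error handling. The `parseSettings` name, the `ParseResult` type, and the message wording are hypothetical; the actual functions in ambient.ts are not shown in this review.

```typescript
// Hypothetical result type; the real CLI may prefer throwing a custom error.
type ParseResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

// Parses settings JSON without letting a SyntaxError crash the caller.
// `path` is only used to build a readable message.
function parseSettings<T = unknown>(raw: string, path: string): ParseResult<T> {
  try {
    return { ok: true, value: JSON.parse(raw) as T };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return {
      ok: false,
      error: `Could not parse ${path}: ${reason}. Fix or remove the file and retry.`,
    };
  }
}
```

Callers can then branch on `ok` and print `error` instead of surfacing a raw stack trace.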


High: Performance - No Scope Gate for Lightweight IMPLEMENT Tasks

Severity: HIGH | Confidence: 95% (Performance reviewer flagged)
File: /shared/skills/implementation-orchestration/SKILL.md:66-76

All IMPLEMENT prompts trigger the full 5-6 agent pipeline regardless of scope. A one-line change and a multi-file refactor pay the same overhead.

Recommended Fix: Add Phase 2.5 scope gate — if plan affects ≤1 file & ≤20 lines, run LIGHT pipeline (Coder + Validator only)


High: Performance - Unbounded DEBUG Agent Budget

Severity: HIGH | Confidence: 95% (Performance reviewer flagged)
File: /shared/skills/debug-orchestration/SKILL.md:40-46

DEBUG can spawn up to 10 Explore agents (5 initial hypotheses + 2 validation + 3 second-round) with no explicit cap.

Recommended Fix: Add explicit agent budget cap: max 8 total agents (5 initial + 2 validation + 1 second-round)


Summary by Reviewer Dimension

| Dimension | Score | Issues | Key Concern |
|-----------|-------|--------|-------------|
| Security | 8/10 | 1 CRITICAL, 3 HIGH, 1 Should-Fix | Bash tool usage scope needs documentation |
| Architecture | 7/10 | 3 HIGH, 2 MEDIUM | Missing agent declarations; undocumented delta vs /implement and /debug |
| Performance | 7/10 | 2 HIGH, 3 MEDIUM | No lightweight path for trivial IMPLEMENT; unbounded DEBUG budget |
| Complexity | 7/10 | 2 HIGH, 3 MEDIUM | Pipeline tables duplicated 3 places; orchestration skills crosstalk |
| Consistency | 5/10 | 2 HIGH, 2 MEDIUM | Task tool unprecedented; search-first dropped; 3→2 tier collapse undocumented |
| Regression | 3/10 | 1 CRITICAL, 3 HIGH | Tests broken; README stale; 5 intentional behavioral changes undocumented |
| Tests | 3/10 | 1 CRITICAL, 1 HIGH | Zero tests for new orchestration skills; integration tests have wrong taxonomy |
| TypeScript | 8/10 | 0 CRITICAL, 0 HIGH | One should-fix on untyped JSON.parse; passes strict compilation |
| Documentation | 5/10 | 2 HIGH, 3 MEDIUM | Plugin README not updated; integration tests stale; no CHANGELOG entry |

Pre-existing Issues (Not Blocking)

Should Consider in Follow-up:

  • Settings file written without atomic write (fs.writeFile not atomic)
  • Coder agent skill frontmatter growing without bounds (7 skills, may add domain skills)
  • No shared Explore/Plan agent definitions (4+ commands define inline, inconsistent behavior)
  • Classification conservatism relies on LLM compliance, no mechanical enforcement
  • CLAUDE.md mentions "3-tier" system but orchestration skills don't fit existing tiers

Commit Message Note

The commit message states "add agent orchestration to ambient mode" which implies an additive change. In reality, this PR:

  1. Renames intent BUILD → IMPLEMENT
  2. Removes two depth tiers (GUIDED/ELEVATE) → single ORCHESTRATED
  3. Removes two skills from IMPLEMENT (search-first, test-driven-development)
  4. Changes enforcement model (main-session skills → agent orchestration)
  5. Downgrades analytical EXPLORE (QUICK/GUIDED split → always QUICK)

Consider updating the commit message to reflect the scope: refactor(ambient): orchestrate agents with taxonomy rename and tier collapse


Recommendation

CHANGES_REQUESTED

This PR is a significant architectural improvement for ambient mode, but requires fixes in 4 blocking areas (integration tests, plugin README, agent declarations, search-first skill) before merge. All blocking items are documentation/manifest corrections, not design changes.

Unblock Sequence:

  1. Fix integration test regex + assertions (15 min)
  2. Rewrite plugin README (20 min)
  3. Audit & add missing agents/skills to plugin.json (10 min)
  4. Decide on search-first restoration or removal documentation (5 min)

High-priority should-fix items (performance scope gate, debug budget cap, JSON.parse error handling) recommended as follow-ups or pre-merge if time permits.


Automated review via Claude Code + Snyk security scanning

name: implementation-orchestration
description: Agent orchestration for IMPLEMENT intent — pre-flight, Coder, quality gates
user-invocable: false
allowed-tools: Read, Grep, Glob, Bash, Task, AskUserQuestion

Orchestration skills declare Task tool with no precedent

The 'Task' tool appears in allowed-tools for the first time across 32+ existing skills. No other skill in the codebase lists Task.

This is either:

  1. A new tool permission that should be documented as a convention for orchestration-tier skills, OR
  2. A mistaken addition that should be removed

Suggested fix: Document why orchestration skills need Task access, or remove it if not required.

Confidence: HIGH - verified against all 32 pre-existing skills

| test-driven-development | Always for BUILD | `*.ts`, `*.tsx`, `*.js`, `*.jsx`, `*.py` |
| implementation-patterns | Always for BUILD | Any code file |
| search-first | Always for BUILD | Any code file |
| implementation-orchestration | Always for IMPLEMENT | Any — orchestrates agent pipeline |

search-first skill silently dropped from IMPLEMENT intent

The skill-catalog previously listed 'search-first' as 'Always for BUILD'. This PR removes it without migration or explanation.

The search-first skill enforces 'look before you leap' — ensuring developers check for existing implementations before writing new code. This is still valid for ambient IMPLEMENT work.

Suggested fix:

  1. Add search-first back to the IMPLEMENT intent in skill-catalog, OR
  2. Document the intentional removal with rationale (e.g., 'redundant with Coder agent's core-patterns skill')

Confidence: HIGH - confirmed across 3 review reports


Based on classified intent, invoke each selected skill using the Skill tool.

| Intent | Primary Skills | Secondary (if file type matches) |

Duplicate pipeline mapping tables across 3 files create maintenance burden

The intent-to-pipeline mapping appears in 3 places with inconsistent structure:

  1. ambient.md Phase 4 (lines 64-69)
  2. ambient-router Step 3 (lines 53-58)
  3. ambient-router Step 5 (lines 79-86)

When the next orchestration skill is added, all 3 tables must update in lockstep. Any mismatch causes confusing behavior where documentation contradicts implementation.

Suggested fix: Make ambient-router the single source of truth. Replace ambient.md Phase 4 table with:

**ORCHESTRATED:**
Invoke each selected skill using the Skill tool per the ambient-router's
Step 3 (skill selection) and Step 5 (agent orchestration). The ambient-router
skill is the single source of truth for intent-to-pipeline mapping.

This collapses a 6-line duplication into a 2-line reference.

Confidence: HIGH - verified by Complexity and Architecture reviews


Pass FILES_CHANGED to all quality gate agents.

## Phase 5: Quality Gates

IMPLEMENT pipeline has no scope gate — trivial changes pay full cost

Phase 5 (Quality Gates) runs the full pipeline (Validator, Simplifier, Scrutinizer, Shepherd) regardless of change scope. A one-line config update gets the same 5-6 agent treatment as a multi-file feature.

The Iron Law says 'no shortcut' for quality, but proportionality matters: trivial changes should have trivial overhead.

Suggested fix: Add a Phase 2.5 scope gate after plan synthesis:

If EXECUTION_PLAN affects <= 1 file and <= 20 lines:
  - Run LIGHT pipeline: Coder → Validator (single pass, no retry)
  - Skip Simplifier, Scrutinizer, Shepherd

Otherwise: full pipeline (Phase 3-6)

This preserves the Iron Law for substantive changes while avoiding expensive pipelines for trivial work.

Confidence: HIGH - Performance review H1
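
The proposed scope gate could be expressed roughly as follows. In the PR the gate would live in the skill's markdown instructions, not in code, so this is only a sketch of the decision rule; `ExecutionPlan`, `selectPipeline`, and `agentsFor` are hypothetical names, and the agent lists are taken from the pipeline description in this PR.

```typescript
type Pipeline = "LIGHT" | "FULL";

// Hypothetical summary of a synthesized plan.
interface ExecutionPlan {
  filesAffected: number;
  linesChanged: number;
}

// Thresholds from the suggested fix: <= 1 file and <= 20 lines.
const LIGHT_MAX_FILES = 1;
const LIGHT_MAX_LINES = 20;

// Both conditions must hold for the LIGHT pipeline; anything else runs FULL.
function selectPipeline(plan: ExecutionPlan): Pipeline {
  const isTrivial =
    plan.filesAffected <= LIGHT_MAX_FILES && plan.linesChanged <= LIGHT_MAX_LINES;
  return isTrivial ? "LIGHT" : "FULL";
}

// LIGHT skips Simplifier, Scrutinizer, and Shepherd.
function agentsFor(pipeline: Pipeline): string[] {
  return pipeline === "LIGHT"
    ? ["Coder", "Validator"]
    : ["Coder", "Validator", "Simplifier", "Scrutinizer", "Shepherd"];
}
```

Requiring both conditions keeps a large single-file rewrite on the full pipeline, which seems closer to the intent of the Iron Law than gating on file count alone.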

dean0x commented Mar 19, 2026

Code Review Summary: PR #149 - Ambient Mode Agent Orchestration

Status: CHANGES_REQUESTED

Nine comprehensive reviews identified 25+ findings across architecture, documentation, complexity, consistency, performance, regression, security, tests, and TypeScript. There are 5 HIGH-confidence blocking issues, plus additional issues requiring documentation updates.


Blocking Issues (Must Fix Before Merge)

1. Integration Tests Broken by Taxonomy Rename

Files: tests/integration/ambient-activation.test.ts, tests/integration/helpers.ts
Severity: CRITICAL

The integration test helpers use hardcoded regex for old taxonomy (BUILD|GUIDED|ELEVATE) but code now emits IMPLEMENT|ORCHESTRATED. Tests will not validate new behavior.

Fix: Update regex patterns to match new taxonomy in helpers.ts and test assertions in ambient-activation.test.ts.

2. Unprecedented 'Task' Tool in allowed-tools

Files: All 3 orchestration skills
Severity: HIGH

No other skill in the codebase requests 'Task' access. This is the first precedent.

Fix: Document why orchestration skills need Task in docs/reference/skills-architecture.md, OR remove it.

3. search-first Skill Silently Dropped

File: shared/skills/ambient-router/references/skill-catalog.md:13
Severity: HIGH

Quality gate removed without explanation.

Fix: Restore search-first to IMPLEMENT, OR document intentional removal.

4. Plugin README.md Completely Stale

Severity: HIGH (Not in current diff)

Shows old 3-tier model and BUILD intent — all removed by this PR.

Fix: Rewrite README for QUICK/ORCHESTRATED model with IMPLEMENT intent and 4 pipelines.

5. Empty CHANGELOG for Breaking Changes

File: CHANGELOG.md
Severity: HIGH

No [Unreleased] entry for breaking changes (intent rename, tier removal, skill removal).

Fix: Add comprehensive [Unreleased] section documenting all breaking changes.


Additional High-Confidence Issues

Testing Gaps (CRITICAL)

  • No new tests for 3 orchestration skills
  • No tests validating agent registry declarations
  • Integration test helpers use removed terminology (inline comment created)

Architectural Documentation

  • Pipeline mapping tables duplicated in 3 places (inline comment created)
  • Undocumented delta from /implement and /debug commands

Performance Issues

  • IMPLEMENT pipeline: No scope gate for trivial changes (inline comment created)
  • DEBUG pipeline: Unbounded exploration depth, no agent budget cap

Consistency Issues

  • 5 different descriptions for ambient mode across 5 locations
  • Ambiguous prompt guidance changes not documented

Inline Comments Created

The following inline comments have been added for high-confidence code issues:

  1. Orchestration skills declare Task tool with no precedent
  2. search-first skill silently dropped from IMPLEMENT
  3. Duplicate pipeline mapping tables across 3 files
  4. IMPLEMENT pipeline has no scope gate for trivial changes

Recommended Fix Priority

  1. Must fix before merge: Update integration tests, add CHANGELOG entry, document/remove Task tool, restore/document search-first (30 min total)
  2. Must fix after merge: Update plugin README as immediate follow-up
  3. Should fix soon: Add unit test for agent registry, implement IMPLEMENT scope gate
  4. Nice to have: De-duplicate pipeline tables, consolidate ambient mode descriptions

Review Summary: Architecture sound, implementation pace exceeded documentation/testing. Core two-tier QUICK/ORCHESTRATED model is solid. All 9 reviewers agree on blocking issues.

…dget

Code review fixes for PR #149:

- Reinstate GUIDED as middle tier: QUICK / GUIDED / ORCHESTRATED
  - GUIDED: small scope (≤2 files), main session + skills + Simplifier
  - ORCHESTRATED: large scope (>2 files), full agent pipeline
  - Scope-based split per intent (IMPLEMENT, DEBUG, PLAN, REVIEW)
- Add search-first to Coder agent permanent skills
- Add hard cap of 8 Explore agents for DEBUG pipeline
- Fix stale integration tests: BUILD→IMPLEMENT, ELEVATE→ORCHESTRATED
- Update plugin README, ambient command, hook preamble for three tiers
- Add CHANGELOG entry under [Unreleased]
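
The scope-based split in this commit can be illustrated with a small function. In the PR itself the classification is performed by the model following the ambient-router skill, not by code, so this is only a sketch of the rule. The `selectDepth` name and the `isTrivial` flag are hypothetical, and treating CHAT as always QUICK is an assumption (the PR states this only for EXPLORE).

```typescript
type Intent = "IMPLEMENT" | "DEBUG" | "PLAN" | "REVIEW" | "EXPLORE" | "CHAT";
type Depth = "QUICK" | "GUIDED" | "ORCHESTRATED";

// Intents that get a scope-based GUIDED/ORCHESTRATED split per the commit message.
const SCOPED_INTENTS: ReadonlyArray<Intent> = ["IMPLEMENT", "DEBUG", "PLAN", "REVIEW"];

// GUIDED covers small scope (<= 2 files); larger scope is ORCHESTRATED.
const GUIDED_MAX_FILES = 2;

function selectDepth(intent: Intent, estimatedFiles: number, isTrivial: boolean): Depth {
  // Trivial prompts and unscoped intents default to QUICK.
  if (isTrivial || SCOPED_INTENTS.indexOf(intent) === -1) {
    return "QUICK";
  }
  return estimatedFiles <= GUIDED_MAX_FILES ? "GUIDED" : "ORCHESTRATED";
}
```

Under these assumptions, "add a login form" touching one or two files would land on IMPLEMENT/GUIDED, while a refactor across API, DB, and frontend would land on IMPLEMENT/ORCHESTRATED.
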

| Intent | Primary Skills | Secondary (if file type matches) |
|--------|---------------|----------------------------------|
| **BUILD** | test-driven-development, implementation-patterns, search-first | typescript (.ts), react (.tsx/.jsx), go (.go), java (.java), python (.py), rust (.rs), frontend-design (CSS/UI), input-validation (forms/API), security-patterns (auth/crypto) |
| **DEBUG** | test-patterns, core-patterns | git-safety (if git operations involved) |
| **IMPLEMENT** | implementation-patterns, search-first | typescript (.ts), react (.tsx/.jsx), go (.go), java (.java), python (.py), rust (.rs), frontend-design (CSS/UI), input-validation (forms/API), security-patterns (auth/crypto) |

Missing TDD in GUIDED IMPLEMENT

The GUIDED-depth IMPLEMENT row lists implementation-patterns, search-first but omits test-driven-development. However:

  • Line 91 contains a conditional: "If test-driven-development is selected (IMPLEMENT intent), you MUST write the failing test"
  • The test-driven-development/SKILL.md documents: "IMPLEMENT/GUIDED -> TDD enforced in main session"
  • plugins/devflow-ambient/README.md:70 advertises: "test-driven-development -- TDD enforcement for IMPLEMENT (GUIDED + ORCHESTRATED)"

This creates a broken contract: TDD is documented as enforced at GUIDED depth, but the router never selects it there.

Fix: Add test-driven-development to the GUIDED IMPLEMENT primary skills:

| **IMPLEMENT** | implementation-patterns, search-first, test-driven-development | typescript (.ts), ... |

Confidence: 95% (HIGH - Regression)

@@ -32,13 +32,23 @@ fi

# Inject classification preamble
PREAMBLE="AMBIENT MODE ACTIVE: Before responding, silently classify this prompt:

Hook preamble bloat undermines QUICK zero-overhead promise

The PREAMBLE string grew from ~40 words to ~150 words and is injected as additionalContext on EVERY user prompt ≥2 words. This contradicts the design promise: "QUICK: Zero overhead."

Impact: Over a 50-prompt session, the preamble alone injects ~7,500 words (150 words × 50 prompts), adding latency to every prompt.

Suggested fix: Reduce preamble to minimal trigger (~25 words):

PREAMBLE="AMBIENT MODE ACTIVE: Classify this prompt using the ambient-router skill in your session context.
Default to QUICK. Only state classification for GUIDED/ORCHESTRATED."

This cuts injection from ~150 words to ~25 words (83% reduction). The full classification logic lives in ambient-router SKILL.md which is already loaded.

Confidence: 95% (HIGH - Performance)


Pass FILES_CHANGED to all quality gate agents.

## Phase 5: Quality Gates

IMPLEMENT pipeline missing agent budget cap

The debug-orchestration skill correctly caps Explore agents at 8 total (your commit 15849ce). The implementation-orchestration pipeline lacks an equivalent cap.

In worst case (both Validator and Shepherd exhaust retries), the pipeline spawns:

  • Coder → Validator (retry 2x) → Coder-fix → Validator → Simplifier → Scrutinizer → Validator (re-validate) → Shepherd (retry 2x) = up to 12 agent invocations per ORCHESTRATED prompt

Impact: 300K-600K tokens, 5-15 minutes per prompt. This is the most expensive path in the system.

Suggested fix: Add explicit agent budget to Phase 5:

## Agent Budget

Hard cap: **8 total agent spawns** across all phases.

| Phase | Allocation |
|-------|-----------|
| Phase 3 (Coder) | 1 (+ up to 1 retry = 2 max) |
| Phase 5 (Quality Gates) | Up to 5 (Validator + Simplifier + Scrutinizer + re-Validate + Shepherd) |
| Phase 5 (Retries) | Retries consume from remaining budget |

If budget exhausted, halt and report what passed.

Confidence: 90% (HIGH - Performance)
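
A hard cap like the one suggested could be tracked with a small counter. The skill would enforce this through its written instructions, so the class below is only a sketch of the accounting; `AgentBudget` and its methods are hypothetical names.

```typescript
// Tracks agent spawns against a fixed cap. Retries draw from the same budget.
class AgentBudget {
  private readonly cap: number;
  private readonly spawned: string[] = [];

  constructor(cap: number) {
    this.cap = cap;
  }

  // Records the spawn and returns true if budget remains; returns false once exhausted.
  trySpawn(agent: string): boolean {
    if (this.spawned.length >= this.cap) {
      return false;
    }
    this.spawned.push(agent);
    return true;
  }

  remaining(): number {
    return this.cap - this.spawned.length;
  }

  // Summary for the "halt and report" step.
  report(): string {
    return `${this.spawned.length}/${this.cap} spawns used: ${this.spawned.join(", ")}`;
  }
}
```

With a cap of 8, the 12-invocation worst case described above would stop at the eighth spawn, and the pipeline would report which gates had passed up to that point.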

"orchestration",
"agents"
],
"agents": [

Missing Explore agent in plugin manifest

The agents array lists: coder, validator, simplifier, scrutinizer, shepherd, skimmer, reviewer

But the orchestration skills explicitly reference Explore agents:

  • debug-orchestration/SKILL.md:34 -- "Spawn one Explore agent per hypothesis"
  • debug-orchestration/SKILL.md:46 -- "Spawn 2-3 Explore agents"
  • plan-orchestration/SKILL.md:33 -- "spawn 2-3 Explore agents"

These are ephemeral Task subagents, not declared shared agent definitions. However, the architectural contract is unclear. Either:

  1. Add an Explore agent definition to shared/agents/ and include "explore" in the agents array, OR
  2. Document the ephemeral subagent pattern in the orchestration skills clarifying that "Explore" means Task(subagent_type="Explore"), not a declared agent

Option 1 is stronger (single source of truth across all commands using Explore agents).

Confidence: 85% (HIGH - Architecture)

dean0x commented Mar 19, 2026

Code Review Comments — PR #149

1. TDD Enforcement Gap (HIGH — Consistency + Regression)

Files: shared/skills/ambient-router/SKILL.md:68, shared/skills/ambient-router/references/skill-catalog.md

The GUIDED-depth IMPLEMENT row omits test-driven-development from the skill selection matrix, but the documentation contradicts this:

  • The README advertises: "test-driven-development -- TDD enforcement for IMPLEMENT (GUIDED + ORCHESTRATED)"
  • The skill itself documents: "IMPLEMENT/GUIDED -> TDD enforced in main session"
  • Step 4 line 91 has a dead conditional: "If test-driven-development is selected (IMPLEMENT intent)..."

Root cause: TDD was removed from the matrix but documentation and conditionals remain.

Fix: Add test-driven-development to the GUIDED IMPLEMENT primary skills:

| **IMPLEMENT** | implementation-patterns, search-first, test-driven-development |

Also add to skill-catalog.md IMPLEMENT row.


2. Hook Preamble 3x Oversized (HIGH — Performance)

File: scripts/hooks/ambient-prompt:34-51

The PREAMBLE injected on every ambient prompt grew from ~40 words to ~150 words. This contradicts the "QUICK: zero overhead" promise:

  • Injected on every prompt ≥2 words
  • ~7,500 words injected per 50-prompt session (150 words × 50 prompts)
  • Full classification rules duplicate content already in ambient-router SKILL

Fix: Trim to minimal trigger (~25 words):

PREAMBLE="AMBIENT MODE ACTIVE: Classify this prompt using the ambient-router skill in your session context. Default to QUICK. Only state classification for GUIDED/ORCHESTRATED."

3. IMPLEMENT Pipeline Missing Agent Budget Cap (HIGH — Performance)

File: shared/skills/implementation-orchestration/SKILL.md:66-76

The DEBUG pipeline has a hard cap (8 agents), but IMPLEMENT has none. Worst case: 12 agent invocations (Coder + Validator with retries + Simplifier + Scrutinizer + Shepherd with retries).

Fix: Add budget section:

## Agent Budget

Hard cap: 8 total agent spawns across all phases.

| Phase | Allocation |
|-------|-----------|
| Phase 3 (Coder) | 1 (+ up to 1 retry = 2 max) |
| Phase 5 (Quality Gates) | Up to 5 (Validator + Simplifier + Scrutinizer + re-Validate + Shepherd) |

If budget is exhausted, halt and report results.

4. Missing git Agent in Ambient Plugin (HIGH — Security + Architecture)

Files: plugins/devflow-ambient/.claude-plugin/plugin.json:18-26, src/cli/plugins.ts:75

The Coder performs git operations (branch setup, commit, push) but the git agent is missing. The /implement plugin includes git for additional safety guardrails.

Fix: Add "git" to the agents array in both plugin.json and plugins.ts.


5. Missing Explore Agent in Plugin Manifest (HIGH — Architecture)

Files: plugins/devflow-ambient/.claude-plugin/plugin.json:18-26, src/cli/plugins.ts:75

Both debug-orchestration and plan-orchestration spawn Explore agents, but Explore is missing from the agents array. This creates an unclear contract for agent orchestration.

Fix: Add "explore" to agents array, OR add a note clarifying that Explore is an ephemeral Task(subagent_type="Explore") not a shared agent definition.


6. Plugin Description Drifts Across Locations (HIGH — Consistency)

Files: Multiple locations

Inconsistent descriptions found:

  • plugin.json:3 + plugins.ts:73: "intent classification with proportional agent orchestration"
  • plugins/devflow-ambient/README.md:69: "intent classification with agent orchestration" (missing "proportional")
  • .claude-plugin/marketplace.json:94: Still reads "auto-loads relevant skills" (stale)
  • src/cli/commands/init.ts:108: Still reads "auto-loads relevant skills" (stale)

Fix:

  1. Update marketplace.json and init.ts to match the updated description
  2. Align all descriptions to: "Ambient mode — intent classification with proportional agent orchestration"

7. Pipeline Logic Duplication (HIGH — Architecture)

Files: shared/skills/implementation-orchestration/SKILL.md, plugins/devflow-implement/commands/implement.md (referenced)

The implementation-orchestration skill defines a 6-phase pipeline that substantially overlaps with the /implement command's 15-phase pipeline. This creates two sources of truth for the same workflow, risking maintenance drift.

Recommendation: Document the intentional delta explicitly at the top of implementation-orchestration/SKILL.md:

## Relationship to /implement Command

This is the lightweight ambient variant. Excluded from ambient ORCHESTRATED:
- Git agent (branch management)
- Skimmer + Explore codebase orientation
- Parallel Plan agents
- Sequential/parallel Coder strategies
- PR creation
- Knowledge persistence

For full lifecycle, use `/implement` directly.

Lower-Confidence Issues (60-79% — Summary Comment Only)

The following issues are captured in the summary comment below due to lower confidence or architectural nuance. Consider addressing in follow-ups:

  • Test assertion loosened (ambient-activation.test.ts:40): Now accepts both GUIDED and ORCHESTRATED for "add a login form"
  • Scope-based split validation weakened (arch review): Architecture M3 notes loss of strict depth testing
  • Cross-skill coupling (arch review): debug-orchestration dynamically loads implementation-orchestration
  • Informal agent references (consistency review): "Explore agent" vs Task(subagent_type=...)
  • No test for ORCHESTRATED path (tests review): Headline feature lacks dedicated test coverage
  • Decision tables duplicated (complexity review): Classification rules repeated across 4 files

Review Status: Approve after addressing the 7 HIGH issues above. All are straightforward fixes with clear suggested implementations.

@@ -1,18 +1,21 @@
# devflow-ambient

⚠ Plugin README is entirely stale -- shows removed model

This README still documents the old architecture:

  • Line 3: "AUTO-LOADS RELEVANT SKILLS" -- updated to "agent orchestration" elsewhere
  • Lines 13-20: Examples show old depth tiers (GUIDED, ELEVATE) that no longer exist
  • Lines 41-45: "Depth Tiers" table has stale ELEVATE tier
  • Line 45: ELEVATE description "workflow nudge" -- this tier was replaced with ORCHESTRATED agent pipelines

The README was updated in the first commit but describes a model that doesn't match the actual code. Users reading this will see:

  • "BUILD/GUIDED" when the code implements "IMPLEMENT/GUIDED"
  • "ELEVATE" tier when it's now "ORCHESTRATED"
  • "GUIDED (2-3 skills)" when GUIDED now means main session + Simplifier
  • "ELEVATE (workflow nudge)" when ORCHESTRATED means full agent pipelines

This is the primary user-facing documentation for ambient mode.

Fix: Complete rewrite to match the new three-tier model (QUICK/GUIDED/ORCHESTRATED) with examples showing new behavior.

Confidence: 100% (HIGH - Documentation)

dean0x commented Mar 19, 2026

Medium-Priority Issues & Architectural Notes

Architecture & Design

  1. Plan Agent Reference Ambiguous (plan-orchestration/SKILL.md:43)

    • "Spawn Plan agent" referenced without formal agent definition
    • Clarify: is this Task(subagent_type="Plan"), or should the Synthesizer agent be used instead?
  2. Cross-Skill Coupling (debug-orchestration/SKILL.md:73)

    • debug-orchestration dynamically loads implementation-orchestration
    • Creates implicit peer dependency violating Dependency Inversion
    • Consider inlining fix pipeline or formally declaring dependency
  3. Removed Escalation Path (ambient-router/SKILL.md)

    • ELEVATE (passive: "consider /implement") was removed
    • No guidance for tasks exceeding ambient capacity
    • Add note: "For full lifecycle, use /implement"

Performance Optimization (Non-blocking)

  1. No Scope-Proportional Fast-Path (implementation-orchestration/SKILL.md:66-76)

    • Quality gates run unconditionally regardless of change scope
    • Suggested: Skip Scrutinizer + re-Validate if FILES_CHANGED ≤ 3 and Validator passes
  2. No Skill Caching (ambient-router/SKILL.md:88-96)

    • Skills reloaded per invocation even if already in session
    • Suggested: Track loaded skills, skip if no compaction occurred
  3. Regex Pattern Duplication (tests/integration/helpers.ts:40,55,63)

    • Classification regex defined 3 times (all updated correctly in this PR)
    • Suggested refactor: Single constant, derived functions
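
The "single constant, derived functions" refactor suggested in the last item could look roughly like the sketch below. The real pattern in tests/integration/helpers.ts is not quoted in this review, so the regex, the INTENT/DEPTH output format, the intent list, and the extractIntent helper are all assumptions for illustration.

```typescript
// Hypothetical sketch: the actual regex in tests/integration/helpers.ts is not shown here.
// Assumes a classification appears in output as INTENT/DEPTH, e.g. "IMPLEMENT/GUIDED".
const CLASSIFICATION_PATTERN =
  /\b(IMPLEMENT|DEBUG|PLAN|EXPLORE)\/(QUICK|GUIDED|ORCHESTRATED)\b/;

// True when the output contains any classification marker.
function hasClassification(output: string): boolean {
  return CLASSIFICATION_PATTERN.test(output);
}

// Capture group 1 holds the intent (assumed layout).
function extractIntent(output: string): string | null {
  return output.match(CLASSIFICATION_PATTERN)?.[1] ?? null;
}

// Capture group 2 holds the depth (assumed layout).
function extractDepth(output: string): string | null {
  return output.match(CLASSIFICATION_PATTERN)?.[2] ?? null;
}
```

With every helper derived from one constant, a vocabulary change such as a tier rename only has to be made in one place.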

Test Coverage Gaps

  1. DEBUG Test Can Pass with Zero Assertions (ambient-activation.test.ts:50-56)

    • `if (hasClassification(...))` with no fallback means the test can pass vacuously
    • Add: expect(output.length).toBeGreaterThan(20) as fallback
  2. No Explicit Plugin Registry Test (plugins.ts:72-77)

    • Core structural change (agents + skills added) has only implicit coverage
    • Suggested: Add test asserting devflow-ambient.agents.includes('coder') etc
  3. No ORCHESTRATED Depth Test (tests/integration/ambient-activation.test.ts)

    • Headline feature (5-6 agent spawns) lacks dedicated test
    • Suggested: Add test for large-scope classification
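
To make the zero-assertion problem from the first item concrete, here is a minimal, framework-free sketch. hasClassification and check are hypothetical stand-ins, not the project's real helpers or its test runner; the point is only the difference between guarding assertions with an if and asserting the precondition itself.

```typescript
// Hypothetical stand-in for the project's helper; the real pattern is not shown in this review.
function hasClassification(output: string): boolean {
  return /\b[A-Z]+\/(QUICK|GUIDED|ORCHESTRATED)\b/.test(output);
}

// Minimal assertion helper so the sketch runs without a test framework.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

// Vacuous shape: with no classification in the output, the body is skipped
// and the test "passes" without checking anything.
function vacuousTest(output: string): number {
  let assertionsRun = 0;
  if (hasClassification(output)) {
    check(output.includes("DEBUG"), "expected DEBUG intent");
    assertionsRun++;
  }
  return assertionsRun;
}

// Strict shape: the classification is asserted first, so its absence fails the test.
function strictTest(output: string): number {
  let assertionsRun = 0;
  check(hasClassification(output), "expected a classification in the output");
  assertionsRun++;
  check(output.includes("DEBUG"), "expected DEBUG intent");
  assertionsRun++;
  return assertionsRun;
}
```

If the suite runs on Jest or Vitest, calling `expect.hasAssertions()` at the top of a test is another way to make a zero-assertion pass impossible.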

Documentation & Consistency (Non-blocking)

  1. Coder Agent Skills Not Documented (shared/agents/coder.md:5)

    • Frontmatter gained search-first + test-driven-development
    • Body text (Responsibilities) doesn't reflect TDD or search-first behavior
    • Add brief note in Responsibilities section
  2. New Task Tool Pattern Undocumented (all orchestration skills:5)

    • allowed-tools: Task is unprecedented in codebase
    • Document in docs/reference/skills-architecture.md or CLAUDE.md
  3. TASK_ID Example Year Inconsistent (coder.md:15 vs implementation-orchestration.md:44)

    • 2025 vs 2026: minor, but suggest aligning both to 2026

Summary: What's Working Well

✅ BUILD → IMPLEMENT, ELEVATE → ORCHESTRATED terminology thoroughly applied
✅ Classification conservatism (default QUICK) is a good principle
✅ Three-tier model (QUICK/GUIDED/ORCHESTRATED) well-reasoned for scope-based split
✅ Debug agent budget cap (8 agents) shows performance awareness
✅ Integration tests updated correctly for new vocabulary
✅ Three new orchestration skills follow consistent structure (phases, Iron Law, error handling)


Recommendation

Conditional Approve: Merge after fixing the 7 HIGH issues (TDD, preamble bloat, IMPLEMENT budget, git agent, Explore manifest, descriptions, init.ts/marketplace).

The 10 MEDIUM issues (cross-skill coupling, Plan ambiguity, fast-path, caching, router density, duplication, Task tool, agent references, EXPLORE downgrade, README change) are valuable improvements but not blocking if documented in CHANGELOG.

dean0x commented Mar 19, 2026

Code Review Summary

PR: #149 | Branch: feat/ambient-orchestration → main


Overall Assessment

This PR implements ambient mode agent orchestration with three well-structured new orchestration skills and a revised classification system (QUICK/GUIDED/ORCHESTRATED). The architecture is sound with clear phased pipelines and defined error handling. However, 5 blocking issues (≥80% confidence) require attention before merge:

Blocking Issues (≥80% Confidence)

HIGH - Consistency/Regression Issues:

  1. test-driven-development missing from GUIDED IMPLEMENT selection (Consistency + Regression, 95% confidence)

    • File: shared/skills/ambient-router/SKILL.md:68
    • TDD is documented as enforced at GUIDED depth but omitted from the router's selection matrix
    • Creates broken contract: TDD skill says it's loaded, but router never selects it
    • Fix: Add test-driven-development to GUIDED IMPLEMENT primary skills
  2. Hook preamble bloat undermines QUICK zero-overhead promise (Performance, 95% confidence)

    • File: scripts/hooks/ambient-prompt:34
    • Preamble grew from ~40 to ~150 words, injected on EVERY prompt ≥2 words
    • Contradicts "QUICK: Zero overhead" design promise
    • 7,500 tokens wasted per 50-prompt session
    • Fix: Reduce preamble to ~25 words, reference ambient-router skill already in context
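
The overhead figure in the preamble item can be reproduced with simple arithmetic. The sketch below assumes roughly one token per word, which is what the 7,500 number implies; the function name and the tokensPerWord parameter are illustrative, and real tokenizers usually produce somewhat more tokens than words.

```typescript
// Assumption: about one token per word, matching the review's 7,500 figure.
// tokensPerWord can be raised to model a real tokenizer.
function preambleOverheadTokens(
  preambleWords: number,
  prompts: number,
  tokensPerWord = 1,
): number {
  return Math.round(preambleWords * prompts * tokensPerWord);
}

const currentOverhead = preambleOverheadTokens(150, 50); // 7500
const trimmedOverhead = preambleOverheadTokens(25, 50); // 1250
const tokensSaved = currentOverhead - trimmedOverhead; // 6250
```

Under the same assumption, the proposed ~25-word preamble would cut the per-session overhead by about five sixths.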

HIGH - Architecture/Performance Issues:

  3. IMPLEMENT pipeline missing agent budget cap (Performance, 90% confidence)

    • File: shared/skills/implementation-orchestration/SKILL.md:66
    • No budget cap on retry loops; worst case = 12 agent spawns per prompt
    • debug-orchestration correctly caps at 8 agents, but IMPLEMENT doesn't
    • Asymmetric cost exposure: 300K-600K tokens, 5-15 minutes per prompt
    • Fix: Add explicit agent budget section
  4. Missing Explore agent in plugin manifest (Architecture, 85% confidence)

    • File: plugins/devflow-ambient/.claude-plugin/plugin.json:18
    • Orchestration skills explicitly reference Explore agents in 3 places, but manifest lists 6 agents (missing Explore)
    • These are ephemeral Task subagents, but architectural contract is unclear
    • Fix: Either add Explore agent definition to shared/agents/ OR document ephemeral subagent pattern
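
As an illustration of what an explicit agent budget for the IMPLEMENT pipeline might enforce, here is a minimal sketch of a shared spawn counter wrapped around a retry loop. The skills in this PR are prompt documents rather than TypeScript, so this only models the control flow; AgentBudget, runWithRetries, and the limits used are hypothetical.

```typescript
// Hypothetical model of a per-prompt agent budget; names and limits are illustrative.
class AgentBudget {
  private spawned = 0;
  private readonly maxSpawns: number;

  constructor(maxSpawns: number) {
    this.maxSpawns = maxSpawns;
  }

  // Records a spawn and returns true if budget remains; returns false otherwise.
  trySpawn(): boolean {
    if (this.spawned >= this.maxSpawns) return false;
    this.spawned++;
    return true;
  }

  get used(): number {
    return this.spawned;
  }
}

type StepResult = "pass" | "fail";

// Retries a step up to maxAttempts times, stopping early once the shared budget is spent.
function runWithRetries(
  step: () => StepResult,
  budget: AgentBudget,
  maxAttempts: number,
): StepResult | "budget-exhausted" {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (!budget.trySpawn()) return "budget-exhausted";
    if (step() === "pass") return "pass";
  }
  return "fail";
}
```

Sharing one budget across all pipeline phases bounds the worst case regardless of how many retry loops are nested inside it.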

HIGH - Documentation Issues:

  5. Plugin README entirely stale (Documentation, 100% confidence)

    • File: plugins/devflow-ambient/README.md:1
    • Still documents old architecture (BUILD/GUIDED/ELEVATE) that no longer exists
    • Shows examples with removed tiers
    • This is the primary user-facing documentation
    • Fix: Rewrite to match three-tier model with new behavior examples

Lower-Confidence Findings (60-79%)

Suggested improvements for follow-up (not blocking):

  • Pipeline duplication (Architecture, M1): orchestration skills duplicate logic from /implement and /debug without documenting relationship
  • Table duplication (Complexity): Classification rules repeated across 4 files (router, command, README, hook)
  • Cross-skill coupling (M1): debug-orchestration dynamically loads implementation-orchestration
  • Test coverage gaps: No explicit test for new plugin agents/skills, no test for ORCHESTRATED depth path

What's Going Well

  • ✅ Three-tier classification model (QUICK/GUIDED/ORCHESTRATED) is well-reasoned and clear
  • ✅ Scope-based split criteria for GUIDED vs ORCHESTRATED are practical
  • ✅ Orchestration skills follow established conventions (Iron Laws, phased pipelines, error handling)
  • ✅ Second commit (15849ce) fixes previous issues (GUIDED restored, debug budget capped)
  • ✅ BUILD→IMPLEMENT and ELEVATE→ORCHESTRATED renames thorough and consistent across all code files
  • ✅ Test helpers and integration tests properly updated
  • ✅ Coder agent enriched with TDD and search-first skills

Files with Inline Comments

See discussion threads for specific line-by-line feedback:

  • shared/skills/ambient-router/SKILL.md -- TDD selection matrix
  • scripts/hooks/ambient-prompt -- Preamble bloat
  • shared/skills/implementation-orchestration/SKILL.md -- Agent budget
  • plugins/devflow-ambient/.claude-plugin/plugin.json -- Explore agent
  • plugins/devflow-ambient/README.md -- Stale documentation

Next Steps: Address the 5 blocking issues above, then this PR is ready to merge. Estimated effort: ~2-3 hours.


Review completed using github-patterns skill with rate-limiting awareness (1-2s between API calls, 60s wait when approaching rate limit). All findings deduplicated across 9 reviewer reports.

dean0x commented Mar 19, 2026

Code Review Summary — PR #149

Branch: feat/ambient-orchestration → main
Reviewers: 7 specialized agents (Security, Architecture, Performance, Complexity, Consistency, Regression, Tests)
Files Changed: 18 (+497, -141)
Confidence Threshold: ≥80% (all comments above)


Verdict

Status: CHANGES REQUESTED

7 HIGH blocking issues require fixes before merge:

  1. TDD missing from GUIDED skill selection matrix
  2. Hook preamble 3x oversized (performance regression)
  3. IMPLEMENT pipeline lacks agent budget cap
  4. Missing git agent in ambient plugin
  5. Missing Explore agent in plugin manifest
  6. Plugin description drifts across 5 locations
  7. Pipeline logic duplication (DRY violation)

10 MEDIUM architectural issues recommended for same PR or follow-up:

  • Cross-skill coupling (debug→implementation)
  • Plan agent reference ambiguous
  • No scope-fast-path in quality gates
  • No skill caching across invocations
  • ambient-router grown to 141 lines (density)
  • ambient.md duplicates router logic
  • New Task tool pattern undocumented
  • Informal agent references (Explore/Plan)
  • EXPLORE downgraded to always QUICK (undocumented)
  • README reclassification undocumented

Scores by Dimension

| Dimension | Score | Assessment |
| --- | --- | --- |
| Security | 8/10 | Fundamentally sound; document Bash scope and rate-limiting |
| Architecture | 6/10 | Well-reasoned three-tier model; fix duplication + clarity gaps |
| Performance | 7/10 | Good classification conservatism; preamble bloat is main concern |
| Complexity | 7/10 | Clear phases and structure; consolidate duplicated tables |
| Consistency | 6/10 | CRITICAL: TDD missing from matrix contradicts documentation |
| Regression | 6/10 | CRITICAL: TDD enforcement lost for main GUIDED path |
| Tests | 5/10 | Vocabulary updated correctly; headline feature lacks coverage |

Overall: 6.4/10 — Well-designed feature with significant implementation gaps


Key Strengths

✅ Scope-based classification (≤2 vs >2 files) is practical and clear
✅ Classification conservatism (default QUICK) reduces false positives
✅ Three-tier model (QUICK/GUIDED/ORCHESTRATED) solves progressive disclosure problem
✅ Orchestration skills follow consistent linear structure (phases, Iron Law, error handling)
✅ Debug pipeline budget cap shows performance discipline
✅ Thorough terminology migration (BUILD→IMPLEMENT, ELEVATE→ORCHESTRATED) across all files
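
The scope-based split and the default-to-QUICK rule noted above can be sketched as a small decision function. This is not the router's actual logic, which lives in ambient-router/SKILL.md as prose. The needsSkills flag is a hypothetical placeholder for however trivial requests are detected, and applying the same two-file threshold to every non-EXPLORE intent is an assumption.

```typescript
type Depth = "QUICK" | "GUIDED" | "ORCHESTRATED";
type Intent = "IMPLEMENT" | "DEBUG" | "PLAN" | "EXPLORE";

interface AmbientRequest {
  intent: Intent;
  filesInScope: number; // estimated number of files the task touches
  needsSkills: boolean; // hypothetical flag: false for trivial requests
}

// Sketch of the three-tier decision under the assumptions stated above.
function classifyDepth(req: AmbientRequest): Depth {
  // Per the PR summary, EXPLORE always classifies as QUICK.
  if (req.intent === "EXPLORE") return "QUICK";
  // Conservative default: anything that does not clearly need skills stays QUICK.
  if (!req.needsSkills) return "QUICK";
  // Scope-based split: up to 2 files is GUIDED, more than 2 is ORCHESTRATED.
  return req.filesInScope <= 2 ? "GUIDED" : "ORCHESTRATED";
}
```

For example, an IMPLEMENT request touching two files would land in GUIDED, while the same request touching five files would spawn the ORCHESTRATED pipeline.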


What Must Be Fixed

  1. Restore TDD to GUIDED/IMPLEMENT — it's the signature quality behavior for the most common tier
  2. Trim hook preamble — 150 words injected per prompt contradicts "zero overhead"
  3. Cap IMPLEMENT agent budget — asymmetry with DEBUG is a resource exhaustion risk
  4. Add git agent — maintain safety parity with /implement
  5. Clarify Explore contract — define if ephemeral or shared agent
  6. Unify descriptions — sync plugin.json, plugins.ts, README, marketplace.json, init.ts
  7. Document duplication — clarify relationship between orchestration skills and explicit commands

Effort Estimate

  • TDD fix: 5 min (add to matrix + catalog)
  • Preamble trim: 10 min (reduce 150 words → 25 words)
  • Budget cap: 10 min (add table to implementation-orchestration)
  • git agent: 5 min (add to two locations)
  • Descriptions: 10 min (edit 5 files)
  • Explore/Plan clarification: 15 min (add notes + decide on formal definition)
  • Documentation: 10 min (top-of-file note in orchestration skills)

Total: ~65 minutes


Next Steps

  1. Address all 7 HIGH issues in this PR
  2. Address 10 MEDIUM issues in same PR or file follow-up issues
  3. Verify hook preamble trim via manual ambient mode testing
  4. Add tests for ORCHESTRATED path and plugin registry
  5. Merge when all HIGH + MEDIUM issues resolved

Generated by: Claude Code Review Agents
Review Date: 2026-03-19
PR: #149

B1: Restore TDD to GUIDED/IMPLEMENT skill selection (was missing)
B2: Trim hook preamble to ~30 words + add git keyword fast-path
B3: Remove hard-capped 8-agent budget from debug-orchestration

S1: Document pipeline delta (vs /implement, /debug) in orchestration skills
S3: Add ORCHESTRATED classification integration test
S4: Update stale marketplace.json + init.ts descriptions
S5: Fix zero-assertion tests (remove if-guard, add expect)
S6: Change debug Phase 5 from implementation-orchestration to GUIDED fix
S7: Standardize agent references to Task(subagent_type="X") format
S8: Extract CLASSIFICATION_PATTERN constant in test helpers
S9: Document behavioral changes in CHANGELOG
S10: Full rewrite of ambient plugin README (hook-based, no /ambient)

Also removes redundant /ambient slash command — ambient mode is hook-only.
@dean0x dean0x merged commit 8388c12 into main Mar 19, 2026
3 checks passed
@dean0x dean0x deleted the feat/ambient-orchestration branch March 19, 2026 21:57