feat(ambient): add agent orchestration to ambient mode by dean0x · Pull Request #149 · dean0x/devflow

dean0x · 2026-03-19T07:49:36Z

Summary

Adds ORCHESTRATED depth tier to ambient mode — spawns agent pipelines for complex multi-file tasks
Three-tier system: QUICK (zero overhead), GUIDED (skills + main session), ORCHESTRATED (skills + agent orchestration)
Three new orchestration skills: implementation-orchestration, debug-orchestration, plan-orchestration
Removes redundant /ambient slash command — ambient mode is hook-only via devflow ambient --enable
Trims hook preamble to ~30 words with git keyword fast-path for zero-overhead git ops

What Changed

New skills (3):

implementation-orchestration — Lightweight /implement pipeline: pre-flight → Coder → Validator → Simplifier → Scrutinizer → Shepherd
debug-orchestration — Lightweight /debug pipeline: hypotheses → parallel Explores → convergence → report → offer fix (GUIDED, not orchestrated)
plan-orchestration — Lightweight Plan pipeline: Skimmer → Explores → Plan agent → gap validation

Modified:

ambient-router SKILL.md — Three-tier depth classification, ORCHESTRATED skill selection, Step 5 agent orchestration
ambient-prompt hook — Trimmed preamble, git keyword fast-path
plugin.json — Added orchestration skills + agents to manifest
skill-catalog.md — Full rewrite with GUIDED/ORCHESTRATED depth columns
Test helpers — Extracted CLASSIFICATION_PATTERN constant, fixed extractDepth to use capture group 2
Integration tests — Fixed zero-assertion tests, added ORCHESTRATED classification test

Removed:

/ambient command (ambient.md) — hook handles everything

Behavioral Changes

EXPLORE intent always classifies as QUICK (was split QUICK/GUIDED)
Simple text edits ("Update the README") classify as QUICK (was BUILD/GUIDED)
Debug agent budget cap removed — agents scale to investigation needs
BUILD intent renamed to IMPLEMENT

Test Plan

npm run build — 35 skills, 17 plugins, all assets distributed
npm test — 250/250 tests pass
npm run test:integration — Integration tests (requires claude CLI + ambient enabled)
Manual: devflow ambient --status works
Manual: git operations get zero overhead (fast-path)
Manual: "add a login form" triggers IMPLEMENT/GUIDED with TDD
Manual: "refactor auth across API, DB, and frontend" triggers IMPLEMENT/ORCHESTRATED

Replace passive skill-only ambient mode with full agent orchestration. Two depth tiers: QUICK (zero overhead) and ORCHESTRATED (skills + agent pipelines).

Three new orchestration skills drive intent-specific pipelines:
- implementation-orchestration: pre-flight → Coder → quality gates
- debug-orchestration: competing hypotheses → parallel Explores → root cause
- plan-orchestration: Skimmer → Explores → Plan agent → gap validation

Key changes:
- Remove ELEVATE tier, replace GUIDED with ORCHESTRATED
- Rename BUILD intent to IMPLEMENT for clarity
- Skills loaded via Skill tool instead of Read (fixes broken loading)
- Add TDD skill to Coder agent frontmatter permanently
- Ambient plugin now includes 7 agents + 4 skills
- Update ambient-prompt hook preamble for new tiers
- Classification conservatism: default to QUICK

Closes #84 (superseded by Skill tool approach).

dean0x · 2026-03-19T07:59:51Z

Code Review Summary: Ambient Mode Agent Orchestration

PR: #149
Branch: feat/ambient-orchestration → main
Review Date: 2026-03-19
Reviewers: 9 comprehensive scans (Security, Architecture, Performance, Complexity, Consistency, Regression, Tests, TypeScript, Documentation)

Blocking Issues (Must Address Before Merge)

Critical: Stale Integration Tests

Severity: CRITICAL | Confidence: 100% (3 reviewers flagged)
Files: /tests/integration/helpers.ts, /tests/integration/ambient-activation.test.ts

The integration test helpers use regex patterns matching the old taxonomy (BUILD|GUIDED|ELEVATE) but the PR renames these to IMPLEMENT|ORCHESTRATED. Tests will silently pass without validating behavior.

Required Fix:

Update helpers.ts line 40 regex: /(IMPLEMENT|DEBUG|REVIEW|PLAN|EXPLORE|CHAT)\s*\/\s*(QUICK|ORCHESTRATED)/
Update helpers.ts line 55 regex: /(IMPLEMENT|DEBUG|REVIEW|PLAN|EXPLORE|CHAT)/
Update helpers.ts line 63 regex: /\s*(QUICK|ORCHESTRATED)/
Update ambient-activation.test.ts lines 36-54 to expect IMPLEMENT and ORCHESTRATED
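
A minimal sketch of how the three helper regexes could collapse into one constant, assuming the two-tier taxonomy listed above. `CLASSIFICATION_PATTERN`, `hasClassification`, and `extractDepth` are names that appear in this PR; `extractIntent` and the function bodies are assumptions for illustration, not the contents of helpers.ts.

```typescript
// Single source of truth for the "INTENT / DEPTH" marker (two-tier taxonomy).
const CLASSIFICATION_PATTERN =
  /(IMPLEMENT|DEBUG|REVIEW|PLAN|EXPLORE|CHAT)\s*\/\s*(QUICK|ORCHESTRATED)/;

// True when the output contains a classification in the new taxonomy.
function hasClassification(output: string): boolean {
  return CLASSIFICATION_PATTERN.test(output);
}

// Hypothetical helper: capture group 1 holds the intent.
function extractIntent(output: string): string | null {
  const match = CLASSIFICATION_PATTERN.exec(output);
  return match?.[1] ?? null;
}

// Capture group 2 holds the depth.
function extractDepth(output: string): string | null {
  const match = CLASSIFICATION_PATTERN.exec(output);
  return match?.[2] ?? null;
}
```

Deriving every helper from one pattern means a future taxonomy rename touches a single line, and output in the old vocabulary (for example `BUILD / GUIDED`) stops matching instead of being silently accepted.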

High: Plugin README Not Updated

Severity: HIGH | Confidence: 100% (2 reviewers flagged)
File: /plugins/devflow-ambient/README.md

README still documents the removed 3-tier model (QUICK/GUIDED/ELEVATE), old intent names (BUILD), and claims "no agents spawned." Users will see incorrect information about how the plugin works.

Required Fix: Rewrite README to document:

Two-tier model: QUICK (zero overhead) | ORCHESTRATED (skills + agent pipelines)
Intent names: IMPLEMENT, DEBUG, PLAN, REVIEW, EXPLORE, CHAT
New orchestration skills: implementation-orchestration, debug-orchestration, plan-orchestration
Agent pipelines: Coder + quality gates, competing hypothesis investigation, design-focused exploration

High: Missing Agent/Skill Declarations

Severity: HIGH | Confidence: 100% (2 reviewers flagged)
Files: /plugins/devflow-ambient/.claude-plugin/plugin.json, /shared/skills/

Orchestration skills reference "Explore agents" and "Plan agents" in phases, but these are not declared in plugin.json agents array. Also missing: synthesizer agent (used by debug-orchestration Phase 5 for convergence). The ambient plugin also lacks a git agent despite implementation-orchestration Phase 1 performing branch safety checks.

Required Fix:

Audit plugin.json agents list against what orchestration skills actually spawn
Either add missing agents to manifest or document that Explore/Plan are ephemeral Task sub-agents
Consider adding synthesizer for convergence (or clarify that convergence is inline)

High: `search-first` Skill Dropped

Severity: HIGH | Confidence: 100% (2 reviewers flagged)
Files: /shared/skills/ambient-router/SKILL.md:55, /shared/skills/ambient-router/references/skill-catalog.md

search-first was "Always for BUILD intent" in main branch but is completely removed from IMPLEMENT skill selection with no migration note. This skill enforces "research-before-building" and was added as a deliberate quality gate in v1.5.0 (#111).

Required Fix:

Either restore search-first to IMPLEMENT skill selection, OR
Document the intentional removal with rationale (e.g., "redundant when Coder has core-patterns skill")

High-Priority Should-Fix Issues

High: Security - Missing try/catch on JSON.parse

Severity: HIGH | Confidence: 95% (Security reviewer flagged)
File: /src/cli/commands/ambient.ts:33,70,95

Exported utility functions call JSON.parse() without try/catch. Corrupted settings.json would crash the CLI.

Recommended Fix: Wrap JSON.parse in try/catch with user-friendly error message
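
One way the recommended fix could look, as a sketch only. `parseSettings` and `ParseResult` are hypothetical names; the real exported functions in ambient.ts may have different signatures and return types.

```typescript
// Hypothetical result type: callers get a message instead of an uncaught exception.
type ParseResult =
  | { ok: true; settings: Record<string, unknown> }
  | { ok: false; message: string };

function parseSettings(raw: string, path: string): ParseResult {
  try {
    const parsed: unknown = JSON.parse(raw);
    // Valid JSON that is not an object (e.g. "42" or "[]") is also unusable as settings.
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      return { ok: false, message: `${path} must contain a JSON object.` };
    }
    return { ok: true, settings: parsed as Record<string, unknown> };
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    return { ok: false, message: `Could not parse ${path}: ${reason}` };
  }
}
```

The CLI can then print `message` and exit non-zero rather than crashing with a stack trace.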

High: Performance - No Scope Gate for Lightweight IMPLEMENT Tasks

Severity: HIGH | Confidence: 95% (Performance reviewer flagged)
File: /shared/skills/implementation-orchestration/SKILL.md:66-76

All IMPLEMENT prompts trigger the full 5-6 agent pipeline regardless of scope. A one-line change and a multi-file refactor pay the same overhead.

Recommended Fix: Add Phase 2.5 scope gate — if plan affects ≤1 file & ≤20 lines, run LIGHT pipeline (Coder + Validator only)

High: Performance - Unbounded DEBUG Agent Budget

Severity: HIGH | Confidence: 95% (Performance reviewer flagged)
File: /shared/skills/debug-orchestration/SKILL.md:40-46

DEBUG can spawn up to 10 Explore agents (5 initial hypotheses + 2 validation + 3 second-round) with no explicit cap.

Recommended Fix: Add explicit agent budget cap: max 8 total agents (5 initial + 2 validation + 2 second-round)

Summary by Reviewer Dimension

| Dimension | Score | Issues | Key Concern |
|-----------|-------|--------|-------------|
| Security | 8/10 | 1 CRITICAL, 3 HIGH, 1 Should-Fix | Bash tool usage scope needs documentation |
| Architecture | 7/10 | 3 HIGH, 2 MEDIUM | Missing agent declarations; undocumented delta vs `/implement`/`/debug` |
| Performance | 7/10 | 2 HIGH, 3 MEDIUM | No lightweight path for trivial IMPLEMENT; unbounded DEBUG budget |
| Complexity | 7/10 | 2 HIGH, 3 MEDIUM | Pipeline tables duplicated 3 places; orchestration skills crosstalk |
| Consistency | 5/10 | 2 HIGH, 2 MEDIUM | Task tool unprecedented; search-first dropped; 3→2 tier collapse undocumented |
| Regression | 3/10 | 1 CRITICAL, 3 HIGH | Tests broken; README stale; 5 intentional behavioral changes undocumented |
| Tests | 3/10 | 1 CRITICAL, 1 HIGH | Zero tests for new orchestration skills; integration tests have wrong taxonomy |
| TypeScript | 8/10 | 0 CRITICAL, 0 HIGH | One should-fix on untyped JSON.parse; passes strict compilation |
| Documentation | 5/10 | 2 HIGH, 3 MEDIUM | Plugin README not updated; integration tests stale; no CHANGELOG entry |

Pre-existing Issues (Not Blocking)

Should Consider in Follow-up:

Settings file written without atomic write (fs.writeFile not atomic)
Coder agent skill frontmatter growing without bounds (7 skills, may add domain skills)
No shared Explore/Plan agent definitions (4+ commands define inline, inconsistent behavior)
Classification conservatism relies on LLM compliance, no mechanical enforcement
CLAUDE.md mentions "3-tier" system but orchestration skills don't fit existing tiers
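
For the first follow-up item, a common pattern is to write to a temporary sibling file and rename it over the target. The sketch below assumes Node.js and uses the synchronous `fs` calls for brevity, whereas the CLI reportedly uses `fs.writeFile`. `writeFileAtomicSync` and the demo path are hypothetical.

```typescript
import { mkdtempSync, readFileSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical helper: write to a sibling temp file, then rename over the target.
// On POSIX filesystems, rename replaces the destination atomically when both paths
// are on the same volume, so readers see the old file or the new one, never a partial write.
function writeFileAtomicSync(targetPath: string, contents: string): void {
  const tempPath = join(dirname(targetPath), `.settings-${process.pid}-${Date.now()}.tmp`);
  try {
    writeFileSync(tempPath, contents, "utf8");
    renameSync(tempPath, targetPath);
  } catch (error) {
    rmSync(tempPath, { force: true }); // best-effort cleanup of the temp file
    throw error;
  }
}

// Demo against a throwaway directory rather than a real settings file.
const demoDir = mkdtempSync(join(tmpdir(), "devflow-demo-"));
const settingsPath = join(demoDir, "settings.json");
writeFileAtomicSync(settingsPath, JSON.stringify({ ambient: true }, null, 2));
const roundTripped = readFileSync(settingsPath, "utf8");
```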

Commit Message Note

The commit message states "add agent orchestration to ambient mode" which implies an additive change. In reality, this PR:

Renames intent BUILD → IMPLEMENT
Removes two depth tiers (GUIDED/ELEVATE) → single ORCHESTRATED
Removes two skills from IMPLEMENT (search-first, test-driven-development)
Changes enforcement model (main-session skills → agent orchestration)
Downgrades analytical EXPLORE (QUICK/GUIDED split → always QUICK)

Consider updating the commit message to reflect the scope: refactor(ambient): orchestrate agents with taxonomy rename and tier collapse

Recommendation

CHANGES_REQUESTED

This PR is a significant architectural improvement for ambient mode, but requires fixes in 4 blocking areas (integration tests, plugin README, agent declarations, search-first skill) before merge. All blocking items are documentation/manifest corrections, not design changes.

Unblock Sequence:

Fix integration test regex + assertions (15 min)
Rewrite plugin README (20 min)
Audit & add missing agents/skills to plugin.json (10 min)
Decide on search-first restoration or removal documentation (5 min)

High-priority should-fix items (performance scope gate, debug budget cap, JSON.parse error handling) recommended as follow-ups or pre-merge if time permits.

Automated review via Claude Code + Snyk security scanning

dean0x · 2026-03-19T07:59:51Z

shared/skills/implementation-orchestration/SKILL.md

+name: implementation-orchestration
+description: Agent orchestration for IMPLEMENT intent — pre-flight, Coder, quality gates
+user-invocable: false
+allowed-tools: Read, Grep, Glob, Bash, Task, AskUserQuestion


Orchestration skills declare Task tool with no precedent

The 'Task' tool appears in allowed-tools for the first time across 32+ existing skills. No other skill in the codebase lists Task.

This is either:

A new tool permission that should be documented as a convention for orchestration-tier skills, OR

A mistaken addition that should be removed

Suggested fix: Document why orchestration skills need Task access, or remove it if not required.

Confidence: HIGH - verified against all 32 pre-existing skills

dean0x · 2026-03-19T07:59:57Z

shared/skills/ambient-router/references/skill-catalog.md

-| test-driven-development | Always for BUILD | `*.ts`, `*.tsx`, `*.js`, `*.jsx`, `*.py` |
-| implementation-patterns | Always for BUILD | Any code file |
-| search-first | Always for BUILD | Any code file |
+| implementation-orchestration | Always for IMPLEMENT | Any — orchestrates agent pipeline |


search-first skill silently dropped from IMPLEMENT intent

The skill-catalog previously listed 'search-first' as 'Always for BUILD'. This PR removes it without migration or explanation.

The search-first skill enforces 'look before you leap' — ensuring developers check for existing implementations before writing new code. This is still valid for ambient IMPLEMENT work.

Suggested fix:

Add search-first back to the IMPLEMENT intent in skill-catalog, OR

Document the intentional removal with rationale (e.g., 'redundant with Coder agent's core-patterns skill')

Confidence: HIGH - confirmed across 3 review reports

dean0x · 2026-03-19T08:00:06Z

shared/skills/ambient-router/SKILL.md

+
+Based on classified intent, invoke each selected skill using the Skill tool.

 | Intent | Primary Skills | Secondary (if file type matches) |


Duplicate pipeline mapping tables across 3 files create maintenance burden

The intent-to-pipeline mapping appears in 3 places with inconsistent structure:

ambient.md Phase 4 (lines 64-69)

ambient-router Step 3 (lines 53-58)

ambient-router Step 5 (lines 79-86)

When the next orchestration skill is added, all 3 tables must update in lockstep. Any mismatch causes confusing behavior where documentation contradicts implementation.

Suggested fix: Make ambient-router the single source of truth. Replace ambient.md Phase 4 table with:

**ORCHESTRATED:** Invoke each selected skill using the Skill tool per the ambient-router's Step 3 (skill selection) and Step 5 (agent orchestration). The ambient-router skill is the single source of truth for intent-to-pipeline mapping.

This collapses a 6-line duplication into a 2-line reference.

Confidence: HIGH - verified by Complexity and Architecture reviews

dean0x · 2026-03-19T08:00:18Z

shared/skills/implementation-orchestration/SKILL.md

+
+Pass FILES_CHANGED to all quality gate agents.
+
+## Phase 5: Quality Gates


IMPLEMENT pipeline has no scope gate — trivial changes pay full cost

Phase 5 (Quality Gates) runs the full pipeline (Validator, Simplifier, Scrutinizer, Shepherd) regardless of change scope. A one-line config update gets the same 5-6 agent treatment as a multi-file feature.

The Iron Law says 'no shortcut' for quality, but proportionality matters: trivial changes should have trivial overhead.

Suggested fix: Add a Phase 2.5 scope gate after plan synthesis:

If EXECUTION_PLAN affects <= 1 file and <= 20 lines:
- Run LIGHT pipeline: Coder → Validator (single pass, no retry)
- Skip Simplifier, Scrutinizer, Shepherd
Otherwise: full pipeline (Phase 3-6)

This preserves the Iron Law for substantive changes while avoiding expensive pipelines for trivial work.
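
The gate is a skill instruction rather than code, but its decision rule can be sketched as a pure function. `ExecutionPlan`, its fields, and `selectPipeline` are hypothetical names; only the thresholds (1 file, 20 lines) come from the suggestion above.

```typescript
// Hypothetical summary of what the plan synthesis step would report.
interface ExecutionPlan {
  filesAffected: number;
  linesChanged: number;
}

type Pipeline = "LIGHT" | "FULL";

// LIGHT: Coder → Validator only. FULL: the complete quality-gate pipeline.
function selectPipeline(plan: ExecutionPlan): Pipeline {
  const isTrivial = plan.filesAffected <= 1 && plan.linesChanged <= 20;
  return isTrivial ? "LIGHT" : "FULL";
}
```

Both conditions must hold, so a 5-line change spread across three files still takes the full pipeline.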

Confidence: HIGH - Performance review H1

dean0x · 2026-03-19T08:01:11Z

Code Review Summary: PR #149 - Ambient Mode Agent Orchestration

Status: CHANGES_REQUESTED

Nine comprehensive reviews identified 25+ findings across architecture, documentation, complexity, consistency, performance, regression, security, and tests. Five of these are blocking issues flagged with high confidence; the remainder mostly require documentation updates.

Blocking Issues (Must Fix Before Merge)

1. Integration Tests Broken by Taxonomy Rename

Files: tests/integration/ambient-activation.test.ts, tests/integration/helpers.ts
Severity: CRITICAL

The integration test helpers use hardcoded regex for old taxonomy (BUILD|GUIDED|ELEVATE) but code now emits IMPLEMENT|ORCHESTRATED. Tests will not validate new behavior.

Fix: Update regex patterns to match new taxonomy in helpers.ts and test assertions in ambient-activation.test.ts.

2. Unprecedented 'Task' Tool in allowed-tools

Files: All 3 orchestration skills
Severity: HIGH

No other skill in the codebase requests 'Task' access; the orchestration skills are the first to do so.

Fix: Document why orchestration skills need Task in docs/reference/skills-architecture.md, OR remove it.

3. search-first Skill Silently Dropped

File: shared/skills/ambient-router/references/skill-catalog.md:13
Severity: HIGH

Quality gate removed without explanation.

Fix: Restore search-first to IMPLEMENT, OR document intentional removal.

4. Plugin README.md Completely Stale

Severity: HIGH (Not in current diff)

Shows old 3-tier model and BUILD intent — all removed by this PR.

Fix: Rewrite README for QUICK/ORCHESTRATED model with IMPLEMENT intent and 4 pipelines.

5. Empty CHANGELOG for Breaking Changes

File: CHANGELOG.md
Severity: HIGH

No [Unreleased] entry for breaking changes (intent rename, tier removal, skill removal).

Fix: Add comprehensive [Unreleased] section documenting all breaking changes.

Additional High-Confidence Issues

Testing Gaps (CRITICAL)

No new tests for 3 orchestration skills
No tests validating agent registry declarations
Integration test helpers use removed terminology (inline comment created)

Architectural Documentation

Pipeline mapping tables duplicated in 3 places (inline comment created)
Undocumented delta from /implement and /debug commands

Performance Issues

IMPLEMENT pipeline: No scope gate for trivial changes (inline comment created)
DEBUG pipeline: Unbounded exploration depth, no agent budget cap

Consistency Issues

5 different descriptions for ambient mode across 5 locations
Ambiguous prompt guidance changes not documented

Inline Comments Created

The following inline comments have been added for high-confidence code issues:

Orchestration skills declare Task tool with no precedent
search-first skill silently dropped from IMPLEMENT
Duplicate pipeline mapping tables across 3 files
IMPLEMENT pipeline has no scope gate for trivial changes

Recommended Fix Priority

Must fix before merge: Update integration tests, add CHANGELOG entry, document/remove Task tool, restore/document search-first (30 min total)
Must fix after merge: Update plugin README as immediate follow-up
Should fix soon: Add unit test for agent registry, implement IMPLEMENT scope gate
Nice to have: De-duplicate pipeline tables, consolidate ambient mode descriptions

Review Summary: Architecture sound, implementation pace exceeded documentation/testing. Core two-tier QUICK/ORCHESTRATED model is solid. All 9 reviewers agree on blocking issues.

…dget

Code review fixes for PR #149:
- Reinstate GUIDED as middle tier: QUICK / GUIDED / ORCHESTRATED
- GUIDED: small scope (≤2 files), main session + skills + Simplifier
- ORCHESTRATED: large scope (>2 files), full agent pipeline
- Scope-based split per intent (IMPLEMENT, DEBUG, PLAN, REVIEW)
- Add search-first to Coder agent permanent skills
- Add hard cap of 8 Explore agents for DEBUG pipeline
- Fix stale integration tests: BUILD→IMPLEMENT, ELEVATE→ORCHESTRATED
- Update plugin README, ambient command, hook preamble for three tiers
- Add CHANGELOG entry under [Unreleased]

dean0x · 2026-03-19T16:46:13Z

shared/skills/ambient-router/SKILL.md

 |--------|---------------|----------------------------------|
-| **BUILD** | test-driven-development, implementation-patterns, search-first | typescript (.ts), react (.tsx/.jsx), go (.go), java (.java), python (.py), rust (.rs), frontend-design (CSS/UI), input-validation (forms/API), security-patterns (auth/crypto) |
-| **DEBUG** | test-patterns, core-patterns | git-safety (if git operations involved) |
+| **IMPLEMENT** | implementation-patterns, search-first | typescript (.ts), react (.tsx/.jsx), go (.go), java (.java), python (.py), rust (.rs), frontend-design (CSS/UI), input-validation (forms/API), security-patterns (auth/crypto) |


Missing TDD in GUIDED IMPLEMENT ❌

The GUIDED-depth IMPLEMENT row lists implementation-patterns, search-first but omits test-driven-development. However:

Line 91 contains a conditional: "If test-driven-development is selected (IMPLEMENT intent), you MUST write the failing test"

The test-driven-development/SKILL.md documents: "IMPLEMENT/GUIDED -> TDD enforced in main session"

plugins/devflow-ambient/README.md:70 advertises: "test-driven-development -- TDD enforcement for IMPLEMENT (GUIDED + ORCHESTRATED)"

This creates a broken contract: TDD is documented as enforced at GUIDED depth, but the router never selects it there.

Fix: Add test-driven-development to the GUIDED IMPLEMENT primary skills:

| **IMPLEMENT** | implementation-patterns, search-first, test-driven-development | typescript (.ts), ... |

Confidence: 95% (HIGH - Regression)

dean0x · 2026-03-19T16:46:23Z

scripts/hooks/ambient-prompt

@@ -32,13 +32,23 @@ fi

 # Inject classification preamble
 PREAMBLE="AMBIENT MODE ACTIVE: Before responding, silently classify this prompt:


Hook preamble bloat undermines QUICK zero-overhead promise ⚠

The PREAMBLE string grew from ~40 words to ~150 words and is injected as additionalContext on EVERY user prompt ≥2 words. This contradicts the design promise: "QUICK: Zero overhead."

Impact: Over a 50-prompt session, ~7,500 tokens consumed by preamble injection alone, adding latency to every prompt.

Suggested fix: Reduce preamble to minimal trigger (~25 words):

PREAMBLE="AMBIENT MODE ACTIVE: Classify this prompt using the ambient-router skill in your session context. Default to QUICK. Only state classification for GUIDED/ORCHESTRATED."

This cuts injection from ~150 words to ~25 words (83% reduction). The full classification logic lives in ambient-router SKILL.md which is already loaded.
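
The figures above can be checked with a few lines of arithmetic. This sketch assumes roughly one token per word, which is what the review's 7,500-token estimate implies; the helper names are hypothetical.

```typescript
// Count whitespace-separated words in a preamble string.
function countWords(text: string): number {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Percentage reduction from `before` to `after`, rounded to the nearest integer.
function reductionPercent(before: number, after: number): number {
  return Math.round(((before - after) / before) * 100);
}

// Total injected size over a session, in the same unit as `perPrompt`.
function sessionOverhead(perPrompt: number, prompts: number): number {
  return perPrompt * prompts;
}

const suggestedPreamble =
  "AMBIENT MODE ACTIVE: Classify this prompt using the ambient-router skill in your session context. " +
  "Default to QUICK. Only state classification for GUIDED/ORCHESTRATED.";

const suggestedWords = countWords(suggestedPreamble); // 22, within the ~25-word target
const reduction = reductionPercent(150, 25); // 83
const currentCost = sessionOverhead(150, 50); // 7500
const trimmedCost = sessionOverhead(25, 50); // 1250
```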

Confidence: 95% (HIGH - Performance)

dean0x · 2026-03-19T16:46:35Z

shared/skills/implementation-orchestration/SKILL.md

+
+Pass FILES_CHANGED to all quality gate agents.
+
+## Phase 5: Quality Gates


IMPLEMENT pipeline missing agent budget cap ⚠

The debug-orchestration skill correctly caps Explore agents at 8 total (your commit 15849ce). The implementation-orchestration pipeline lacks an equivalent cap.

In worst case (both Validator and Shepherd exhaust retries), the pipeline spawns:

Coder → Validator (retry 2x) → Coder-fix → Validator → Simplifier → Scrutinizer → Validator (re-validate) → Shepherd (retry 2x) = up to 12 agent invocations per ORCHESTRATED prompt

Impact: 300K-600K tokens, 5-15 minutes per prompt. This is the most expensive path in the system.

Suggested fix: Add explicit agent budget to Phase 5:

## Agent Budget

Hard cap: **8 total agent spawns** across all phases.

| Phase | Allocation |
|-------|-----------|
| Phase 3 (Coder) | 1 (+ up to 1 retry = 2 max) |
| Phase 5 (Quality Gates) | Up to 5 (Validator + Simplifier + Scrutinizer + re-Validate + Shepherd) |
| Phase 5 (Retries) | Retries consume from remaining budget |

If budget exhausted, halt and report what passed.

Confidence: 90% (HIGH - Performance)

dean0x · 2026-03-19T16:46:46Z

plugins/devflow-ambient/.claude-plugin/plugin.json

+    "orchestration",
+    "agents"
+  ],
+  "agents": [


Missing Explore agent in plugin manifest ❌

The agents array lists: coder, validator, simplifier, scrutinizer, shepherd, skimmer, reviewer

But the orchestration skills explicitly reference Explore agents:

debug-orchestration/SKILL.md:34 -- "Spawn one Explore agent per hypothesis"

debug-orchestration/SKILL.md:46 -- "Spawn 2-3 Explore agents"

plan-orchestration/SKILL.md:33 -- "spawn 2-3 Explore agents"

These are ephemeral Task subagents, not declared shared agent definitions. However, the architectural contract is unclear. Either:

Add an Explore agent definition to shared/agents/ and include "explore" in the agents array, OR

Document the ephemeral subagent pattern in the orchestration skills clarifying that "Explore" means Task(subagent_type="Explore"), not a declared agent

Option 1 is stronger (single source of truth across all commands using Explore agents).

Confidence: 85% (HIGH - Architecture)
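
Either option would be easier to keep honest with a small manifest audit. The sketch below compares the declared agents listed in this comment against a list of agents the skills reference. `findUndeclaredAgents` and the `referencedAgents` list are illustrative assumptions, not code from the repo.

```typescript
// Return every referenced agent that the manifest does not declare (case-insensitive).
function findUndeclaredAgents(
  declared: readonly string[],
  referenced: readonly string[],
): string[] {
  const known = new Set(declared.map((name) => name.toLowerCase()));
  return referenced.filter((name) => !known.has(name.toLowerCase()));
}

// Declared agents as listed in this comment.
const declaredAgents = [
  "coder", "validator", "simplifier", "scrutinizer", "shepherd", "skimmer", "reviewer",
];

// Illustrative list: in practice this would be collected from the orchestration SKILL.md files.
const referencedAgents = ["coder", "validator", "skimmer", "Explore"];

const undeclared = findUndeclaredAgents(declaredAgents, referencedAgents); // ["Explore"]
```

If Explore stays an ephemeral Task subagent, the audit would need an explicit allowlist so the exception is documented in one place.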

dean0x · 2026-03-19T16:46:52Z

Code Review Comments — PR #149

1. TDD Enforcement Gap (HIGH — Consistency + Regression)

Files: shared/skills/ambient-router/SKILL.md:68, shared/skills/ambient-router/references/skill-catalog.md

The GUIDED-depth IMPLEMENT row omits test-driven-development from the skill selection matrix, but the documentation contradicts this:

The README advertises: "test-driven-development -- TDD enforcement for IMPLEMENT (GUIDED + ORCHESTRATED)"
The skill itself documents: "IMPLEMENT/GUIDED -> TDD enforced in main session"
Step 4 line 91 has a dead conditional: "If test-driven-development is selected (IMPLEMENT intent)..."

Root cause: TDD was removed from the matrix but documentation and conditionals remain.

Fix: Add test-driven-development to the GUIDED IMPLEMENT primary skills:

| **IMPLEMENT** | implementation-patterns, search-first, test-driven-development |

Also add to skill-catalog.md IMPLEMENT row.

2. Hook Preamble 3x Oversized (HIGH — Performance)

File: scripts/hooks/ambient-prompt:34-51

The PREAMBLE injected on every ambient prompt grew from ~40 words to ~150 words. This contradicts the "QUICK: zero overhead" promise:

Injected on every prompt ≥2 words
~7,500 tokens per 50-prompt session
Full classification rules duplicate content already in ambient-router SKILL

Fix: Trim to minimal trigger (~25 words):

PREAMBLE="AMBIENT MODE ACTIVE: Classify this prompt using the ambient-router skill in your session context. Default to QUICK. Only state classification for GUIDED/ORCHESTRATED."

3. IMPLEMENT Pipeline Missing Agent Budget Cap (HIGH — Performance)

File: shared/skills/implementation-orchestration/SKILL.md:66-76

The DEBUG pipeline has a hard cap (8 agents), but IMPLEMENT has none. Worst case: 12 agent invocations (Coder + Validator with retries + Simplifier + Scrutinizer + Shepherd with retries).

Fix: Add budget section:

## Agent Budget

Hard cap: 8 total agent spawns across all phases.

| Phase | Allocation |
|-------|-----------|
| Phase 3 (Coder) | 1 (+ up to 1 retry = 2 max) |
| Phase 5 (Quality Gates) | Up to 5 (Validator + Simplifier + Scrutinizer + re-Validate + Shepherd) |

If budget is exhausted, halt and report results.
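
The budget itself lives in skill instructions that the model follows, but the intended accounting can be sketched as a counter. `createAgentBudget` and its methods are hypothetical; only the cap of 8 and the 12-invocation worst case come from the review.

```typescript
// Hypothetical budget tracker: every spawn, including retries, draws from one shared cap.
function createAgentBudget(cap = 8) {
  const spawned: string[] = [];
  return {
    // Returns false once the cap is reached; the caller should halt and report.
    trySpawn(agent: string): boolean {
      if (spawned.length >= cap) return false;
      spawned.push(agent);
      return true;
    },
    remaining(): number {
      return cap - spawned.length;
    },
    report(): readonly string[] {
      return spawned;
    },
  };
}
```

Under this scheme the 12-invocation worst case stops after the eighth spawn, and `report()` gives the list of agents that ran.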

4. Missing git Agent in Ambient Plugin (HIGH — Security + Architecture)

Files: plugins/devflow-ambient/.claude-plugin/plugin.json:18-26, src/cli/plugins.ts:75

The Coder performs git operations (branch setup, commit, push) but the git agent is missing. The /implement plugin includes git for additional safety guardrails.

Fix: Add "git" to the agents array in both plugin.json and plugins.ts.

5. Missing Explore Agent in Plugin Manifest (HIGH — Architecture)

Files: plugins/devflow-ambient/.claude-plugin/plugin.json:18-26, src/cli/plugins.ts:75

Both debug-orchestration and plan-orchestration spawn Explore agents, but Explore is missing from the agents array. This creates an unclear contract for agent orchestration.

Fix: Add "explore" to agents array, OR add a note clarifying that Explore is an ephemeral Task(subagent_type="Explore") not a shared agent definition.

6. Plugin Description Drifts Across Locations (HIGH — Consistency)

Files: Multiple locations

Inconsistent descriptions found:

plugin.json:3 + plugins.ts:73: "intent classification with proportional agent orchestration"
plugins/devflow-ambient/README.md:69: "intent classification with agent orchestration" (missing "proportional")
.claude-plugin/marketplace.json:94: Still reads "auto-loads relevant skills" (stale)
src/cli/commands/init.ts:108: Still reads "auto-loads relevant skills" (stale)

Fix:

Update marketplace.json and init.ts to match the updated description
Align all descriptions to: "Ambient mode — intent classification with proportional agent orchestration"

7. Pipeline Logic Duplication (HIGH — Architecture)

Files: shared/skills/implementation-orchestration/SKILL.md, plugins/devflow-implement/commands/implement.md (referenced)

The implementation-orchestration skill defines a 6-phase pipeline that substantially overlaps with the /implement command's 15-phase pipeline. This creates two sources of truth for the same workflow, risking maintenance drift.

Recommendation: Document the intentional delta explicitly at the top of implementation-orchestration/SKILL.md:

## Relationship to /implement Command

This is the lightweight ambient variant. Excluded from ambient ORCHESTRATED:
- Git agent (branch management)
- Skimmer + Explore codebase orientation
- Parallel Plan agents
- Sequential/parallel Coder strategies
- PR creation
- Knowledge persistence

For full lifecycle, use `/implement` directly.

Lower-Confidence Issues (60-79% — Summary Comment Only)

The following issues are captured in the summary comment below due to lower confidence or architectural nuance. Consider addressing in follow-ups:

Test assertion loosened (ambient-activation.test.ts:40): Now accepts both GUIDED and ORCHESTRATED for "add a login form"
Scope-based split validation weakened (arch review): Architecture M3 notes loss of strict depth testing
Cross-skill coupling (arch review): debug-orchestration dynamically loads implementation-orchestration
Informal agent references (consistency review): "Explore agent" vs Task(subagent_type=...)
No test for ORCHESTRATED path (tests review): Headline feature lacks dedicated test coverage
Decision tables duplicated (complexity review): Classification rules repeated across 4 files

Review Status: Approve after addressing the 7 HIGH issues above. All are straightforward fixes with clear suggested implementations.

dean0x · 2026-03-19T16:46:57Z

plugins/devflow-ambient/README.md

@@ -1,18 +1,21 @@
 # devflow-ambient


⚠ Plugin README is entirely stale -- shows removed model

This README still documents the old architecture:

Line 3: "AUTO-LOADS RELEVANT SKILLS" -- updated to "agent orchestration" elsewhere

Lines 13-20: Examples show old depth tiers (GUIDED, ELEVATE) that no longer exist

Lines 41-45: "Depth Tiers" table has stale ELEVATE tier

Line 45: ELEVATE description "workflow nudge" -- this tier was replaced with ORCHESTRATED agent pipelines

The README was updated in the first commit but describes a model that doesn't match the actual code. Users reading this will see:

"BUILD/GUIDED" when the code implements "IMPLEMENT/GUIDED"

"ELEVATE" tier when it's now "ORCHESTRATED"

"GUIDED (2-3 skills)" when GUIDED now means main session + Simplifier

"ELEVATE (workflow nudge)" when ORCHESTRATED means full agent pipelines

This is the primary user-facing documentation for ambient mode.

Fix: Complete rewrite to match the new three-tier model (QUICK/GUIDED/ORCHESTRATED) with examples showing new behavior.

Confidence: 100% (HIGH - Documentation)

dean0x · 2026-03-19T16:47:09Z

Medium-Priority Issues & Architectural Notes

Architecture & Design

Plan Agent Reference Ambiguous (plan-orchestration/SKILL.md:43)
- "Spawn Plan agent" referenced without formal agent definition
- Clarify: is this Task(subagent_type="Plan"), or should the Synthesizer agent be used instead?
Cross-Skill Coupling (debug-orchestration/SKILL.md:73)
- debug-orchestration dynamically loads implementation-orchestration
- Creates implicit peer dependency violating Dependency Inversion
- Consider inlining fix pipeline or formally declaring dependency
Removed Escalation Path (ambient-router/SKILL.md)
- ELEVATE (passive: "consider /implement") was removed
- No guidance for tasks exceeding ambient capacity
- Add note: "For full lifecycle, use /implement"

Performance Optimization (Non-blocking)

No Scope-Proportional Fast-Path (implementation-orchestration/SKILL.md:66-76)
- Quality gates run unconditionally regardless of change scope
- Suggested: Skip Scrutinizer + re-Validate if FILES_CHANGED ≤ 3 and Validator passes
No Skill Caching (ambient-router/SKILL.md:88-96)
- Skills reloaded per invocation even if already in session
- Suggested: Track loaded skills, skip if no compaction occurred
Regex Pattern Duplication (tests/integration/helpers.ts:40,55,63)
- Classification regex defined 3 times (all updated correctly in this PR)
- Suggested refactor: Single constant, derived functions

Test Coverage Gaps

DEBUG Test Can Pass with Zero Assertions (ambient-activation.test.ts:50-56)
- `if (hasClassification(...))` with no fallback means the test can pass vacuously
- Add: expect(output.length).toBeGreaterThan(20) as fallback
No Explicit Plugin Registry Test (plugins.ts:72-77)
- Core structural change (agents + skills added) has only implicit coverage
- Suggested: Add test asserting devflow-ambient.agents.includes('coder') etc
No ORCHESTRATED Depth Test (tests/integration/ambient-activation.test.ts)
- Headline feature (5-6 agent spawns) lacks dedicated test
- Suggested: Add test for large-scope classification
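
To illustrate the first gap, here is a framework-free sketch of a check that cannot pass with zero assertions. `checkDebugOutput` is a hypothetical helper, and the pattern assumes the three-tier taxonomy; the 20-character fallback mirrors the suggestion above.

```typescript
const CLASSIFICATION =
  /(IMPLEMENT|DEBUG|REVIEW|PLAN|EXPLORE|CHAT)\s*\/\s*(QUICK|GUIDED|ORCHESTRATED)/;

// Throws on failure and returns how many checks actually ran, which is always at least 1.
function checkDebugOutput(output: string): number {
  let assertions = 0;
  const match = CLASSIFICATION.exec(output);
  if (match) {
    if (match[1] !== "DEBUG") {
      throw new Error(`expected DEBUG intent, got ${match[1]}`);
    }
    assertions += 1;
  } else {
    // Fallback branch: without a classification, the output must at least be non-trivial.
    if (output.length <= 20) {
      throw new Error("no classification and output too short to be a real response");
    }
    assertions += 1;
  }
  return assertions;
}
```

In a Jest or Vitest suite, the same guarantee is usually expressed with `expect.assertions(n)` or an explicit `else` branch in the test body.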

Documentation & Consistency (Non-blocking)

Coder Agent Skills Not Documented (shared/agents/coder.md:5)
- Frontmatter gained search-first + test-driven-development
- Body text (Responsibilities) doesn't reflect TDD or search-first behavior
- Add brief note in Responsibilities section
New Task Tool Pattern Undocumented (all orchestration skills:5)
- allowed-tools: Task is unprecedented in codebase
- Document in docs/reference/skills-architecture.md or CLAUDE.md
TASK_ID Example Year Inconsistent (coder.md:15 vs implementation-orchestration.md:44)
- 2025 vs 2026: minor, but suggest aligning both to 2026

Summary: What's Working Well

✅ BUILD → IMPLEMENT, ELEVATE → ORCHESTRATED terminology thoroughly applied
✅ Classification conservatism (default QUICK) is a good principle
✅ Three-tier model (QUICK/GUIDED/ORCHESTRATED) well-reasoned for scope-based split
✅ Debug agent budget cap (8 agents) shows performance awareness
✅ Integration tests updated correctly for new vocabulary
✅ Three new orchestration skills follow consistent structure (phases, Iron Law, error handling)

Recommendation

Conditional Approve: Merge after fixing the 7 HIGH issues (TDD, preamble bloat, IMPLEMENT budget, git agent, Explore manifest, descriptions, init.ts/marketplace).

The 10 MEDIUM issues (cross-skill coupling, Plan ambiguity, fast-path, caching, router density, duplication, Task tool, agent references, EXPLORE downgrade, README change) are valuable improvements but not blocking if documented in CHANGELOG.

dean0x · 2026-03-19T16:47:16Z

Code Review Summary

PR: #149 | Branch: feat/ambient-orchestration → main

Overall Assessment

This PR implements ambient mode agent orchestration with three well-structured new orchestration skills and a revised classification system (QUICK/GUIDED/ORCHESTRATED). The architecture is sound with clear phased pipelines and defined error handling. However, 5 blocking issues (≥80% confidence) require attention before merge:

Blocking Issues (≥80% Confidence)

HIGH - Consistency/Regression Issues:

test-driven-development missing from GUIDED IMPLEMENT selection (Consistency + Regression, 95% confidence)
- File: shared/skills/ambient-router/SKILL.md:68
- TDD is documented as enforced at GUIDED depth but omitted from the router's selection matrix
- Creates broken contract: TDD skill says it's loaded, but router never selects it
- Fix: Add test-driven-development to GUIDED IMPLEMENT primary skills
Hook preamble bloat undermines QUICK zero-overhead promise (Performance, 95% confidence)
- File: scripts/hooks/ambient-prompt:34
- Preamble grew from ~40 to ~150 words, injected on EVERY prompt ≥2 words
- Contradicts "QUICK: Zero overhead" design promise
- 7,500 tokens wasted per 50-prompt session
- Fix: Reduce preamble to ~25 words, reference ambient-router skill already in context
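The 7,500 figure above follows from simple multiplication, reproduced in the sketch below. The function name and the `tokensPerWord` parameter are assumptions for illustration. The review's number corresponds to counting one token per word, and real tokenizers often produce somewhat more, so treat the results as rough estimates.

```typescript
// Estimated preamble overhead for a session, in tokens.
// tokensPerWord = 1 reproduces the review's figure; it is an assumption.
function preambleOverhead(
  wordsPerPrompt: number,
  prompts: number,
  tokensPerWord = 1,
): number {
  return wordsPerPrompt * prompts * tokensPerWord;
}

const current = preambleOverhead(150, 50); // 150 words x 50 prompts = 7500
const trimmed = preambleOverhead(25, 50);  // 25 words x 50 prompts = 1250
const saved = current - trimmed;           // 6250 tokens saved per session
```

Under the same assumption, trimming the preamble from about 150 words to about 25 removes roughly five sixths of the per-session overhead.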

HIGH - Architecture/Performance Issues:

IMPLEMENT pipeline missing agent budget cap (Performance, 90% confidence)
- File: shared/skills/implementation-orchestration/SKILL.md:66
- No budget cap on retry loops; worst case = 12 agent spawns per prompt
- debug-orchestration correctly caps at 8 agents, but IMPLEMENT doesn't
- Asymmetric cost exposure: 300K-600K tokens, 5-15 minutes per prompt
- Fix: Add explicit agent budget section
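One way to picture the requested budget cap is the counter sketched below. This is a sketch under assumptions: `AgentBudget` and `trySpawn` are hypothetical names, the real skills express such limits in prose, not code, and the cap of 8 simply mirrors the debug-orchestration figure quoted above.

```typescript
// Hypothetical per-prompt budget that every agent spawn must pass through.
class AgentBudget {
  private spawned = 0;
  private readonly cap: number;

  constructor(cap: number) {
    this.cap = cap;
  }

  // Records the spawn and returns true if budget remains; otherwise false.
  trySpawn(): boolean {
    if (this.spawned >= this.cap) return false;
    this.spawned++;
    return true;
  }

  get used(): number {
    return this.spawned;
  }
}

// Simulate the worst case from the review: 12 attempted spawns, cap of 8.
const budget = new AgentBudget(8);
let granted = 0;
for (let attempt = 0; attempt < 12; attempt++) {
  if (budget.trySpawn()) granted++;
}
```

After the loop, only 8 of the 12 attempted spawns have been granted. The remaining retries would have to be reported to the user instead of silently consuming more tokens.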

Missing Explore agent in plugin manifest (Architecture, 85% confidence)
- File: plugins/devflow-ambient/.claude-plugin/plugin.json:18
- Orchestration skills explicitly reference Explore agents in 3 places, but manifest lists 6 agents (missing Explore)
- These are ephemeral Task subagents, but architectural contract is unclear
- Fix: Either add Explore agent definition to shared/agents/ OR document ephemeral subagent pattern

HIGH - Documentation Issues:

Plugin README entirely stale (Documentation, 100% confidence)
- File: plugins/devflow-ambient/README.md:1
- Still documents old architecture (BUILD/GUIDED/ELEVATE) that no longer exists
- Shows examples with removed tiers
- This is the primary user-facing documentation
- Fix: Rewrite to match three-tier model with new behavior examples

Lower-Confidence Findings (60-79%)

Suggested improvements for follow-up (not blocking):

Pipeline duplication (Architecture, M1): orchestration skills duplicate logic from /implement and /debug without documenting relationship
Table duplication (Complexity): Classification rules repeated across 4 files (router, command, README, hook)
Cross-skill coupling (M1): debug-orchestration dynamically loads implementation-orchestration
Test coverage gaps: No explicit test for new plugin agents/skills, no test for ORCHESTRATED depth path

What's Going Well

✅ Three-tier classification model (QUICK/GUIDED/ORCHESTRATED) is well-reasoned and clear
✅ Scope-based split criteria for GUIDED vs ORCHESTRATED are practical
✅ Orchestration skills follow established conventions (Iron Laws, phased pipelines, error handling)
✅ Second commit (15849ce) fixes previous issues (GUIDED restored, debug budget capped)
✅ BUILD→IMPLEMENT and ELEVATE→ORCHESTRATED renames thorough and consistent across all code files
✅ Test helpers and integration tests properly updated
✅ Coder agent enriched with TDD and search-first skills

Files with Inline Comments

See discussion threads for specific line-by-line feedback:

shared/skills/ambient-router/SKILL.md -- TDD selection matrix
scripts/hooks/ambient-prompt -- Preamble bloat
shared/skills/implementation-orchestration/SKILL.md -- Agent budget
plugins/devflow-ambient/.claude-plugin/plugin.json -- Explore agent
plugins/devflow-ambient/README.md -- Stale documentation

Next Steps: Address the 5 blocking issues above, then this PR is ready to merge. Estimated effort: ~2-3 hours.

Review completed using github-patterns skill with rate-limiting awareness (1-2s between API calls, 60s wait when approaching rate limit). All findings deduplicated across 9 reviewer reports.

dean0x · 2026-03-19T16:47:24Z

Code Review Summary — PR #149

Branch: feat/ambient-orchestration → main
Reviewers: 7 specialized agents (Security, Architecture, Performance, Complexity, Consistency, Regression, Tests)
Files Changed: 18 (+497, -141)
Confidence Threshold: ≥80% (all comments above)

Verdict

Status: CHANGES REQUESTED

7 HIGH blocking issues require fixes before merge:

TDD missing from GUIDED skill selection matrix
Hook preamble 3x oversized (performance regression)
IMPLEMENT pipeline lacks agent budget cap
Missing git agent in ambient plugin
Missing Explore agent in plugin manifest
Plugin description drifts across 5 locations
Pipeline logic duplication (DRY violation)

10 MEDIUM architectural issues recommended for same PR or follow-up:

Cross-skill coupling (debug→implementation)
Plan agent reference ambiguous
No scope-fast-path in quality gates
No skill caching across invocations
ambient-router grown to 141 lines (density)
ambient.md duplicates router logic
New Task tool pattern undocumented
Informal agent references (Explore/Plan)
EXPLORE downgraded to always QUICK (undocumented)
README reclassification undocumented

Scores by Dimension

Dimension	Score	Assessment
Security	8/10	Fundamentally sound; document Bash scope and rate-limiting
Architecture	6/10	Well-reasoned three-tier model; fix duplication + clarity gaps
Performance	7/10	Good classification conservatism; preamble bloat is main concern
Complexity	7/10	Clear phases and structure; consolidate duplicated tables
Consistency	6/10	CRITICAL: TDD missing from matrix contradicts documentation
Regression	6/10	CRITICAL: TDD enforcement lost for main GUIDED path
Tests	5/10	Vocabulary updated correctly; headline feature lacks coverage

Overall: 6.4/10 — Well-designed feature with significant implementation gaps

Key Strengths

✅ Scope-based classification (≤2 vs >2 files) is practical and clear
✅ Classification conservatism (default QUICK) reduces false positives
✅ Three-tier model (QUICK/GUIDED/ORCHESTRATED) solves progressive disclosure problem
✅ Orchestration skills follow consistent linear structure (phases, Iron Law, error handling)
✅ Debug pipeline budget cap shows performance discipline
✅ Thorough terminology migration (BUILD→IMPLEMENT, ELEVATE→ORCHESTRATED) across all files
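The scope-based split praised above can be restated as a small decision function. This is a hedged illustration, not the router's implementation: the router is a SKILL.md document, and the `PromptScope` shape, the `trivialEdit` flag, and the exact intent list are assumptions. Only the ≤2 vs >2 file threshold and the "EXPLORE is always QUICK" rule come from the PR text.

```typescript
type Intent = "EXPLORE" | "IMPLEMENT" | "DEBUG" | "PLAN"; // assumed intent list
type Depth = "QUICK" | "GUIDED" | "ORCHESTRATED";

// Hypothetical summary of what the router infers from a prompt.
interface PromptScope {
  intent: Intent;
  filesTouched: number; // estimated number of files the task will change
  trivialEdit: boolean; // e.g. a simple text edit such as "Update the README"
}

function classifyDepth(scope: PromptScope): Depth {
  // EXPLORE and simple text edits stay at QUICK per the PR's behavioral changes.
  if (scope.intent === "EXPLORE" || scope.trivialEdit) return "QUICK";
  // Scope-based split: small changes are GUIDED, larger ones ORCHESTRATED.
  return scope.filesTouched <= 2 ? "GUIDED" : "ORCHESTRATED";
}
```

For example, an IMPLEMENT prompt expected to touch five files would land in ORCHESTRATED, while the same intent confined to one file would stay GUIDED.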

What Must Be Fixed

Restore TDD to GUIDED/IMPLEMENT — it's the signature quality behavior for the most common tier
Trim hook preamble — 150 words injected per prompt contradicts "zero overhead"
Cap IMPLEMENT agent budget — asymmetry with DEBUG is a resource exhaustion risk
Add git agent — maintain safety parity with /implement
Clarify Explore contract — define if ephemeral or shared agent
Unify descriptions — sync plugin.json, plugins.ts, README, marketplace.json, init.ts
Document duplication — clarify relationship between orchestration skills and explicit commands

Effort Estimate

TDD fix: 5 min (add to matrix + catalog)
Preamble trim: 10 min (reduce 150 words → 25 words)
Budget cap: 10 min (add table to implementation-orchestration)
git agent: 5 min (add to two locations)
Descriptions: 10 min (edit 5 files)
Explore/Plan clarification: 15 min (add notes + decide on formal definition)
Documentation: 10 min (top-of-file note in orchestration skills)

Total: ~65 minutes

Next Steps

Address all 7 HIGH issues in this PR
Address 10 MEDIUM issues in same PR or file follow-up issues
Verify hook preamble trim via manual ambient mode testing
Add tests for ORCHESTRATED path and plugin registry
Merge when all HIGH + MEDIUM issues resolved

Generated by: Claude Code Review Agents
Review Date: 2026-03-19
PR: #149

B1: Restore TDD to GUIDED/IMPLEMENT skill selection (was missing)
B2: Trim hook preamble to ~30 words + add git keyword fast-path
B3: Remove hard-capped 8-agent budget from debug-orchestration
S1: Document pipeline delta (vs /implement, /debug) in orchestration skills
S3: Add ORCHESTRATED classification integration test
S4: Update stale marketplace.json + init.ts descriptions
S5: Fix zero-assertion tests (remove if-guard, add expect)
S6: Change debug Phase 5 from implementation-orchestration to GUIDED fix
S7: Standardize agent references to Task(subagent_type="X") format
S8: Extract CLASSIFICATION_PATTERN constant in test helpers
S9: Document behavioral changes in CHANGELOG
S10: Full rewrite of ambient plugin README (hook-based, no /ambient)

Also removes the redundant /ambient slash command; ambient mode is hook-only.
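The S8 change (a single CLASSIFICATION_PATTERN constant with derived helper functions) might look roughly like the sketch below. The regex and the "INTENT/DEPTH" output format are assumptions for illustration, and the real pattern in tests/integration/helpers.ts may differ. The only detail taken from the PR is that the depth sits in capture group 2.

```typescript
// Assumed output format: "<INTENT>/<DEPTH>", e.g. "IMPLEMENT/ORCHESTRATED".
// Group 1 captures the intent, group 2 the depth.
const CLASSIFICATION_PATTERN =
  /\b(EXPLORE|IMPLEMENT|DEBUG|PLAN)\s*\/\s*(QUICK|GUIDED|ORCHESTRATED)\b/;

// All helpers derive from the one constant, so a vocabulary change
// (such as BUILD -> IMPLEMENT) only has to be made in one place.
function hasClassification(output: string): boolean {
  return CLASSIFICATION_PATTERN.test(output);
}

function extractIntent(output: string): string | null {
  return output.match(CLASSIFICATION_PATTERN)?.[1] ?? null;
}

function extractDepth(output: string): string | null {
  return output.match(CLASSIFICATION_PATTERN)?.[2] ?? null;
}
```

Because the pattern has no `g` flag, `test` and `match` carry no state between calls, so the helpers can be called in any order on the same output.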


dean0x merged commit 8388c12 into main Mar 19, 2026
3 checks passed

dean0x deleted the feat/ambient-orchestration branch March 19, 2026 21:57


		Based on classified intent, invoke each selected skill using the Skill tool.

		\| Intent \| Primary Skills \| Secondary (if file type matches) \|


		Pass FILES_CHANGED to all quality gate agents.

		## Phase 5: Quality Gates

		@@ -32,13 +32,23 @@ fi

		# Inject classification preamble
		PREAMBLE="AMBIENT MODE ACTIVE: Before responding, silently classify this prompt:

dean0x commented Mar 19, 2026

High: `search-first` Skill Dropped