feat(pov): Parzival 2.1 — Skill Architecture + Dispatch System#65
Open
Hidden-History wants to merge 22 commits into main
…C-20
Deploy Parzival 2.1 integration from verified staging (3 rounds adversarial review,
zero issues). Key changes:
- Shim architecture: skill content in _ai-memory/pov/skills/, thin pointers in
.claude/skills/ (7 skills: agent-dispatch, agent-lifecycle, bmad-dispatch,
model-dispatch, parzival-bootstrap, parzival-constraints, parzival-team-builder)
- GC-19 (spawn agents as teammates) and GC-20 (no instruction in activation)
added to global constraints with full 5-point enforcement integration
- Parzival identity rewritten for direct agent management model
- Agent-dispatch workflow inlines selection guide and combination sequences
- Installer: generate_parzival_skill_shims(), setup_model_dispatch(), stale cleanup
- Config parameterized with {USER_NAME} for installer substitution
- Superseded files removed: teams/, team-prompt workflow, moved templates/data
Decisions: DEC-114 through DEC-120
Staging: oversight/staging/parzival-2.1/ (356 files, PM #195)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Updated 9 existing docs:
- PARZIVAL-SESSION-GUIDE: slash commands, menu items, GC-04 compliance, shim refs
- README-POV: full rewrite (17 constraints, 15-item menu, current file structure)
- CHANGELOG-POV: v2.1.0 entry with upgrade guide
- CONSTRAINT-ENFORCEMENT-SYSTEM: full constraint table, Layer 1/3 self-check
- ESCALATION-ADOPTION-GUIDE: removed stale file refs
- INSTALL-GUIDE-POV: unified installer, shim generation, config sync
- BMAD-Multi-Agent-Architecture: 2.1 context, GC-19/20, current paths
- Multi-Agent-Research-Tracker: fixed duplicate BP IDs, teammate pattern
- SHARDING_STRATEGY: fixed stale bmad-parzival-module paths
Created 4 new skill guides:
- TEAM-BUILDER-GUIDE: tier selection, sizing, presets, conflict avoidance
- AGENT-DISPATCH-GUIDE: 9-step lifecycle, instruction template, corrections
- BMAD-DISPATCH-GUIDE: agent selection, activation sequence, DC-08 rule
- MODEL-DISPATCH-GUIDE: provider setup, model selection, dispatch modes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adversarial review by 5 parallel verifiers found 26 issues across 8 files:
README.md:
- Parzival section rewritten for 2.1 (full lifecycle, dispatch skills table)
- Added Agent Lifecycle to dispatch table (was missing)
- "optional but highly recommended" with cross-session memory rationale
CONSTRAINT-ENFORCEMENT-SYSTEM.md (6 fixes):
- GC-04 compliance: Parzival dispatches agents, not user
- Stale CONSTRAINTS.md/PROCEDURES.md paths replaced
- Self-check layer structure corrected
README-POV.md (7 fixes):
- Self-check "three layers" → "two layers"
- Config field oversight_folder → oversight_path
- Activation command /pov:agents:parzival → /pov:parzival
- Constraint range "GC-01 to GC-20" → "GC-01 through GC-15 + GC-19 + GC-20"
CHANGELOG-POV.md: Removed incorrect teams_enabled claim
INSTALL-GUIDE-POV.md (6 fixes): Phantom function name, stale paths, slash commands
BMAD-DISPATCH-GUIDE.md: Added missing Maintenance + QA phases
MODEL-DISPATCH-GUIDE.md: Corrected Vertex AI URL
Multi-Agent-Research-Tracker.md: Fixed 16 broken BP source links → Qdrant reference
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Round 2 review (5 verifiers) found 8 remaining issues, all fixed:
- config.yaml: Restore teams_enabled: true (required for Claude Code teammates)
- CONSTRAINT-ENFORCEMENT-SYSTEM: Fix layer count contradiction (15+2, not 13+4)
- MODEL-DISPATCH-GUIDE: Add missing "general" category, video-gen triggers, audio signal
- BMAD-DISPATCH-GUIDE: Fix PM workflow phase scope to "Discovery (or any phase)"
- INSTALL-GUIDE-POV: Label "dispatch skills" → "skills" (2 of 7 are core, not dispatch)
- Multi-Agent-Research-Tracker: Fix stale methodology to reference Qdrant storage
R1: 26 issues → R2: 8 issues → converging toward zero.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
R3 review (4 verifiers) found 5 remaining issues:
- INSTALL-GUIDE-POV: 3 stale links to removed CONSTRAINTS.md/PROCEDURES.md replaced with current constraint-enforcement-system and workflows refs
- BMAD-Multi-Agent-Architecture: Footer date synced to header (2026-03-15)
- README-POV: GC-04 behavioral contradiction. Duties sections 2/4/5 still described the v1.x prompt-provider model. Updated to the v2.1 dispatch model (Parzival activates and manages agents, user manages Parzival only)
Convergence: R1=42 → R2=8 → R3=5 (3 stale links + 1 date + 1 behavioral)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
R4 found 4 more instances of v1.x "provide prompt / user runs" language:
- README-POV: "Provides review agent prompts" → "Dispatches review agents"
- CONSTRAINT-ENFORCEMENT-SYSTEM: Layer 4 diagram + 2 test scenarios updated from prompt-provider model to v2.1 dispatch model
All same class as R3 finding.
Convergence: R1=42 → R2=8 → R3=5 → R4=4
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add tiktoken, anthropic, langfuse, numpy to critical package verification (these are eagerly imported; missing any crashes the memory module).
Add tree-sitter grammars (js, ts, go, rust) and spacy to optional package verification (these have graceful fallbacks but should warn if missing).
Previously only 5 critical + 2 optional were checked. Now 9 critical + 7 optional, matching every package the memory module actually imports.
Prevents silent degradation: spaCy and tree-sitter gaps went unnoticed for weeks because pip partial failures weren't caught by verification.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
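A minimal sketch of the critical/optional split described in this commit, assuming the check is done by probing importability. The function name `verify_packages` and the `is_installed` hook are hypothetical; the real verification lives in the installer and may work differently.

```python
from importlib.util import find_spec


def verify_packages(critical, optional, is_installed=None):
    """Return (missing_critical, missing_optional) for top-level module names.

    `is_installed` is a hypothetical hook for testing; by default a module
    counts as installed when importlib can locate it without importing it.
    """
    check = is_installed or (lambda name: find_spec(name) is not None)
    missing_critical = [name for name in critical if not check(name)]
    missing_optional = [name for name in optional if not check(name)]
    return missing_critical, missing_optional


def report(missing_critical, missing_optional):
    """Critical gaps are fatal; optional gaps only warn (graceful fallback)."""
    for name in missing_optional:
        print(f"WARNING: optional package missing: {name}")
    if missing_critical:
        raise SystemExit(f"ERROR: critical packages missing: {missing_critical}")
```

Called with the names from the commit message, for example `verify_packages(["tiktoken", "anthropic", "langfuse", "numpy"], ["spacy"])`, a partial pip failure shows up as a non-empty list instead of a later import crash.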
5dc2586 to 8960d8e
…ndings resolved
- BUG-225: Classifier SKIP_RECLASSIFICATION_TYPES expanded to 9 (frozenset), error_pattern regex tightened
- BUG-218: RRF score clamped with min(0.95, ...) in search.py
- BUG-226: Edit hook reads full file via Path.read_text() with 200KB guard and OSError fallback
- BUG-227: Installer Option 1 copies docker/.env.example with error handling
- BUG-219: Scanner source_type="user_session" added to store_async.py
- BUG-222: Verified step-03-create-handoff.md already exists (QA false positive)
- TD-296: 15 full-content skill shims replaced with thin routing shims (82KB→5KB, 93.5% reduction)
  - 7 POV skill shims removed (dynamically generated by installer)
  - 2 orphaned canonical directories deleted
  - Installer deploy_ai_memory_skills() + trigger extraction added
15 review findings from dual-model adversarial review (Sonnet + Opus) also fixed.
2541 tests passing. Zero legitimate issues at convergence.
Ref: PLAN-018, PM #200
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
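To illustrate the BUG-218 item, here is a sketch of reciprocal rank fusion with the result capped at 0.95. Only the `min(0.95, ...)` clamp comes from the commit message; the function name, the `k` parameter, and the absence of any score scaling are assumptions, and the actual search.py may normalize scores differently before clamping.

```python
def rrf_scores(rankings, k=60, cap=0.95):
    """Fuse several ranked lists of document ids with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) per document (rank starts at 1).
    The fused score is clamped so it never exceeds `cap`.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Clamp as in the commit message: min(0.95, score)
    return {doc_id: min(cap, score) for doc_id, score in scores.items()}
```

With the conventional `k=60` the raw sums stay far below the cap, so the clamp only matters when scores are scaled up or `k` is small; a cap below 1.0 keeps a fused result from ever reading as a perfect match.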
…ndings fixed
Phase 2 items:
- TD-275/289: Cherry-picked tags on all 108 emit_trace_event calls
- TD-290: Classifier LLM calls wrapped with @observe(as_type="generation")
- TD-291: freshness_scan → freshness_scan_complete event naming
- TD-292: Quality gate skip Prometheus metric in both hook scripts
- TD-262: BMAD_LOG_LEVEL → AI_MEMORY_LOG_LEVEL with deprecation alias
- TD-189: Langfuse moved to optional dependency ([observability] extra)
- TD-220: SQL injection fix in langfuse_setup.sh (psql -v parameterized)
- TD-221: Dead fallback fixed in langfuse_setup.sh:606
- TD-293/294/295: Doc/spec fixes, parzival.md standards + dispatch refs
- TD-219/222: Verified already fixed (screenshots dir, macOS sed)
- BUG-221: Verified not a bug (MagicMock bypasses frozen)
Review findings (25 total, all fixed):
- BUG-228: Copy-paste tags in injection.py (bootstrap/format, not greedy_fill)
- BUG-229: hook_type label mismatch in user_prompt_store_async.py
- BUG-230/231: langfuse_setup.sh dead fallback + hardcoded port
- BUG-232: parzival.md dispatch skill paths marked [PLANNED]
- BUG-233: E2E conftest.py videos dir + absolute paths
- BUG-234: 8 flaky caplog tests fixed (propagate=False, BUG-209 pattern)
- BUG-235: test_logging.py deprecation warning coverage
- TD-297: injection.py project_id threading to trace events
- TD-298: freshness.py per-invocation session_id
- TD-299/300: llm_classifier.py top-level imports + narrowed exceptions
- TD-301: langfuse_stop_hook.py targeted ImportError guard
- TD-302: langfuse_instrument.py removed dead _NoOpGeneration
- TD-303: logging_config.py formatter update on reconfig
14 agents across 4 rounds (3 dev → 6 review → 3 fix → 2 re-review).
Dual-model adversarial review (Sonnet + Opus) converged at zero issues.
2542 tests passing. Zero failures on both random and deterministic ordering.
Ref: PLAN-018, PM #201
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
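The TD-262 rename with a deprecation alias could look roughly like the sketch below. The two variable names come from the commit message; the function name `resolve_log_level`, the "INFO" default, and the precedence rule (new name wins) are assumptions rather than the project's confirmed behavior.

```python
import os
import warnings


def resolve_log_level(env=None, default="INFO"):
    """Pick the log level, preferring the new variable over the legacy alias."""
    env = os.environ if env is None else env
    if "AI_MEMORY_LOG_LEVEL" in env:
        return env["AI_MEMORY_LOG_LEVEL"].upper()
    if "BMAD_LOG_LEVEL" in env:
        # Legacy name still works, but callers are told to migrate.
        warnings.warn(
            "BMAD_LOG_LEVEL is deprecated; use AI_MEMORY_LOG_LEVEL",
            DeprecationWarning,
            stacklevel=2,
        )
        return env["BMAD_LOG_LEVEL"].upper()
    return default
```

Passing the environment as a mapping keeps the function easy to test, which fits the BUG-235 item about covering the deprecation warning in test_logging.py.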
…ion_worker
setup_hook_logging() sets logger to INFO at import time, overriding caplog.at_level(DEBUG) from the autouse fixture. Add explicit setLevel(DEBUG) after each local import to ensure DEBUG-level log assertions work.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ruff SIM117 requires single with statement with multiple contexts instead of nested with blocks. Three instances at lines 62, 94, 273. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
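For readers unfamiliar with SIM117, the sketch below contrasts the nested form Ruff flags with the combined form it asks for. The `tracked` context manager is a hypothetical helper used only to show that both forms enter and exit in the same order.

```python
from contextlib import contextmanager


@contextmanager
def tracked(name, log):
    """Hypothetical context manager that records enter/exit order."""
    log.append(f"enter {name}")
    try:
        yield name
    finally:
        log.append(f"exit {name}")


def nested(log):
    # Flagged by SIM117: two nested `with` blocks.
    with tracked("a", log) as a:
        with tracked("b", log) as b:
            return a + b


def combined(log):
    # Preferred: one `with` statement with multiple contexts.
    with tracked("a", log) as a, tracked("b", log) as b:
        return a + b
```

Both functions behave identically; the combined form only removes one level of indentation.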
Black version mismatch (local 26.1.0 vs CI 26.3.0) caused formatting differences in files touched by PLAN-018 Phase 1+2 commits. Aligned local to pinned version and reformatted. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…luator config, CHANGELOG v2.3.0
- BUG-219: add source_type="user_session" to store_async.py (all 5 hooks now consistent)
- M-12: add _ai-memory/ to .gitignore template in install.sh (existing-file and create-new paths)
- Evaluator: re-comment base_url in evaluator_config.yaml, remove /v1 from compose default, add /v1 auto-append in provider.py
- CHANGELOG: add v2.3.0 release entry
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
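The "/v1 auto-append" behavior mentioned for provider.py might be approximated as follows. This is a sketch under the assumption that the goal is an idempotent suffix; the function name `normalize_base_url` is hypothetical and the real provider code is not shown in this PR text.

```python
def normalize_base_url(base_url):
    """Ensure the URL ends in exactly one '/v1' segment, without a trailing slash."""
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/v1"):
        return trimmed
    return trimmed + "/v1"
```

Making the function idempotent means the compose default can omit `/v1` while a user-supplied URL that already includes it is left alone.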
…eanup
Restructure .env handling so docker/.env is the ONLY config file:
- Rewrite docker/.env.example with 5-section layout (API Keys, Auto-Generated, Feature Toggles, Configuration, Internal): 248 lines, zero missing keys
- Gut import_user_env() in install.sh (deprecated, warns if legacy root .env exists)
- Fix upgrade.sh to read from $INSTALL_DIR/docker/.env (was $INSTALL_DIR/.env)
- Fix rollback.sh to restore to $INSTALL_DIR/docker/ with legacy fallback
- Fix classifier/config.py to search ~/.ai-memory/docker/.env (was ~/.ai-memory/.env)
- Fix config.py pydantic env_file to use absolute path via AI_MEMORY_INSTALL_DIR
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
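A sketch of how the single config path could be resolved, assuming AI_MEMORY_INSTALL_DIR takes precedence and `~/.ai-memory` is the fallback (inferred from the classifier path above). The function names are hypothetical, and the actual config.py wires the path into pydantic settings rather than returning it.

```python
import os
from pathlib import Path


def resolve_env_file(env=None, home=None):
    """Return the docker/.env path under the install directory."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # Assumed fallback: ~/.ai-memory when AI_MEMORY_INSTALL_DIR is unset.
    install_dir = Path(env.get("AI_MEMORY_INSTALL_DIR", home / ".ai-memory"))
    return install_dir / "docker" / ".env"


def legacy_env_file(env=None, home=None):
    """Return the pre-restructure location (root .env), used only for warnings."""
    return resolve_env_file(env, home).parent.parent / ".env"
```

Keeping the legacy path in a separate helper mirrors the commit's approach: the old location is detected and warned about, but never read as configuration.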
- Add conftest.py AI_MEMORY_INSTALL_DIR isolation (set before class import)
- Fix test_config.py test_env_file_loading to use _env_file= parameter
- Fix test_agent_memory.py test_parzival_config_defaults with _env_file=None
- Fix test_evaluator_provider.py 3 ollama tests: clear OLLAMA_API_KEY env
- Fix test_sops_encryption.py test_secrets_backend_in_env_file with _env_file=
- Update CHANGELOG.md v2.3.0 with TD-308 entries
- Update INSTALL.md, CONFIGURATION.md, TROUBLESHOOTING.md: docker/.env paths
2543 passed, 0 env-related failures. 3 pre-existing flaky (test ordering).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…se parsing
Docker Compose .env parser rejects inline comments containing quote characters.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…dantic parse failure
Move all comments to the line above KEY=value (never inline). Docker Compose .env parser and pydantic-settings DotEnvSettingsSource both treat inline # comments as part of the value, causing parse errors.
Also set GITHUB_SYNC_ENABLED, JIRA_SYNC_ENABLED, LANGFUSE_ENABLED to false in template, since these require credentials that aren't present on fresh install.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
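The "comment above, never inline" rule can be applied mechanically. The sketch below is a deliberately naive illustration: `hoist_inline_comments` is a hypothetical helper, and it assumes an inline comment starts at the first " #" after the "=" sign, so it would mis-handle values that legitimately contain " #" inside quotes.

```python
def hoist_inline_comments(text):
    """Rewrite 'KEY=value # note' as a '# note' line followed by 'KEY=value'."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        # Leave blank lines, full-line comments, and non-assignments untouched.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            out.append(line)
            continue
        key, _, rest = line.partition("=")
        value, sep, comment = rest.partition(" #")
        if sep:
            out.append("# " + comment.strip())
            out.append(f"{key}={value.rstrip()}")
        else:
            out.append(line)
    return "\n".join(out)
```

For example, `LANGFUSE_ENABLED=false # needs credentials` becomes a `# needs credentials` line followed by `LANGFUSE_ENABLED=false`.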
memory.__init__ → storage → chunking → truncation → tiktoken import chain causes github-sync container to crash loop after rebuild. Added tiktoken>=0.5.0,<1.0.0 matching main requirements.txt spec. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…v.example
Uncommented 8 vars that were commented but referenced by docker-compose
with ${VAR:-default} syntax: 5 GitHub sync timeouts, 3 embedding retry
params. Added 3 new vars: QDRANT_TIMEOUT, QDRANT_USE_HTTPS,
AI_MEMORY_QUEUE_DIR. All compose-referenced vars now documented.
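A check like the one below could catch this class of drift in the future. It is a sketch, not part of the PR: `undocumented_vars` is a hypothetical helper, the regex covers only the `${VAR}`, `${VAR-default}`, and `${VAR:-default}` forms, and commented-out assignments are treated as undocumented, matching the situation this commit fixed.

```python
import re

# Matches ${VAR}, ${VAR-default} and ${VAR:-default}; captures only the name.
VAR_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::?-[^}]*)?\}")


def undocumented_vars(compose_text, env_example_text):
    """Return compose-referenced variables with no active line in .env.example."""
    referenced = set(VAR_REF.findall(compose_text))
    documented = set()
    for line in env_example_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            documented.add(line.split("=", 1)[0].strip())
    return sorted(referenced - documented)
```

Run against the compose file and docker/.env.example, an empty result would correspond to the commit's claim that all compose-referenced vars are documented.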
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…procedures
Add BUG-236 fix entry, container rebuild instructions, and post-update troubleshooting section. Document critical rules for docker compose operations.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Parzival 2.1 introduces the skill-based dispatch architecture, enabling Parzival to activate and manage parallel agent teams directly via Claude Code teams throughout the full project lifecycle.
Core Changes
- Skill shim architecture: Skill content lives in _ai-memory/pov/skills/ (7 skills), with thin shim files in .claude/skills/ that point to them. This keeps the POV module self-contained while maintaining Claude Code skill discovery.
- Dispatch skills: Team Builder (parallel team design), Agent Dispatch (instruction preparation), Agent Lifecycle (monitoring/review/shutdown), BMAD Dispatch (agent selection), Model Dispatch (optional multi-provider LLM routing)
- Constraints GC-19 and GC-20: Agents must be spawned as teammates (not standalone subagents). BMAD activation and instruction must be separate messages.
- Updated identity: Parzival now activates and manages all agents directly. The user interacts with Parzival only; Parzival manages everything else.
- Installer updates: generate_parzival_skill_shims() creates shims dynamically from skill frontmatter. setup_model_dispatch() provides optional multi-provider setup. Stale 2.0 files are cleaned up automatically.
Documentation
Files Changed
_ai-memory/pov/, .claude/skills/, docs/parzival/, scripts/install.sh
Test Plan
- /pov:parzival activates with 15-item menu
- teams_enabled: true in installed config
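To make the skill shim architecture under Core Changes concrete, here is a rough sketch of generating a thin shim from a skill's frontmatter. It is not the installer's generate_parzival_skill_shims(), whose implementation is not shown here; the function names, the `name`/`description` frontmatter keys, the SKILL.md file name, and the shim wording are all assumptions.

```python
def parse_frontmatter(text):
    """Parse simple 'key: value' pairs between leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def render_shim(skill_dir_name, skill_text, content_root="_ai-memory/pov/skills"):
    """Build a thin shim that keeps discovery metadata and points at the real skill."""
    meta = parse_frontmatter(skill_text)
    name = meta.get("name", skill_dir_name)
    description = meta.get("description", "")
    # Assumed layout: one SKILL.md per skill directory under the content root.
    target = f"{content_root}/{skill_dir_name}/SKILL.md"
    return (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "---\n"
        f"Read and follow the full skill at {target}\n"
    )
```

The shim carries only what skill discovery needs, while the full content stays inside the POV module, which is the separation the PR description argues for.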