From 0efd2329bb2a9a47d26f8671d8378321e14cffb6 Mon Sep 17 00:00:00 2001
From: Seth Ford
Date: Mon, 9 Mar 2026 06:50:38 -0400
Subject: [PATCH 1/8] feat: add minimal pipeline template for trivial single-file fixes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- New `minimal` template: intake → build → test → PR with no quality gates, no intelligence, haiku model, and max 3 build iterations
- Project detection recommends `minimal` for projects with ≤5 source files
- Daemon triage selects `minimal` for score ≥90 (very simple issues)
- Updated CLAUDE.md pipeline template tables
- Added to upgrade file list for `shipwright upgrade`
- Updated tests for project-detect and daemon-triage

Closes #122

Co-Authored-By: Claude Opus 4.6
---
 .claude/CLAUDE.md | 562 ++++++++++++++-------------
 scripts/lib/daemon-triage.sh | 4 +-
 scripts/lib/project-detect.sh | 8 +-
 scripts/sw-lib-daemon-triage-test.sh | 4 +
 scripts/sw-project-detect-test.sh | 18 +-
 scripts/sw-upgrade.sh | 1 +
 templates/pipelines/minimal.json | 74 ++++
 7 files changed, 389 insertions(+), 282 deletions(-)
 create mode 100644 templates/pipelines/minimal.json

diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
index fde86cf0b..a1155cf55 100644
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -201,16 +201,17 @@ The build stage delegates to `shipwright loop` for autonomous multi-iteration de

 ## Pipeline Templates

-| Template | Stages | Gates | Use Case |
-| ------------ | ------------------------------------------ | --------------------------------- | ------------------------ |
-| `fast` | intake → build → test → PR | all auto | Quick fixes |
-| `standard` | intake → plan → build → test → review → PR | approve: plan, review, pr | Normal feature work |
-| `full` | all stages | approve: plan, review, pr, deploy | Production deployment |
-| `hotfix` | intake → build → test → PR | all auto | Urgent production fixes |
-| `autonomous` | all stages | all auto | Daemon-driven delivery |
-| `enterprise` | all stages | all approve, auto-rollback | Maximum safety |
-| `cost-aware` | all stages | all auto, budget checks | Budget-limited delivery |
-| `deployed` | all + deploy + validate + monitor | approve: deploy | Full deploy + monitoring |
+| Template | Stages | Gates | Use Case |
+| ------------ | ------------------------------------------ | --------------------------------- | ------------------------- |
+| `minimal` | intake → build → test → PR | all auto | Trivial single-file fixes |
+| `fast` | intake → build → test → PR | all auto | Quick fixes |
+| `standard` | intake → plan → build → test → review → PR | approve: plan, review, pr | Normal feature work |
+| `full` | all stages | approve: plan, review, pr, deploy | Production deployment |
+| `hotfix` | intake → build → test → PR | all auto | Urgent production fixes |
+| `autonomous` | all stages | all auto | Daemon-driven delivery |
+| `enterprise` | all stages | all approve, auto-rollback | Maximum safety |
+| `cost-aware` | all stages | all auto, budget checks | Budget-limited delivery |
+| `deployed` | all + deploy + validate + monitor | approve: deploy | Full deploy + monitoring |

 ## CLI Flags

@@ -508,133 +509,136 @@ All scripts are bash (except the dashboard server in TypeScript).
Grouped by lay -| File | Lines | Purpose | -| --- | ---: | --- | -| `scripts/sw-activity.sh` | 480 | Live agent activity stream | -| `scripts/sw-adaptive.sh` | 941 | data-driven pipeline tuning | -| `scripts/sw-adversarial.sh` | 259 | Adversarial Agent Code Review | -| `scripts/sw-architecture-enforcer.sh` | 319 | Living Architecture Model & Enforcer | -| `scripts/sw-auth.sh` | 610 | GitHub OAuth Authentication | -| `scripts/sw-autonomous.sh` | 1057 | Master controller for AI-building-AI loop | -| `scripts/sw-changelog.sh` | 696 | Automated Release Notes & Migration Guides | -| `scripts/sw-checkpoint.sh` | 605 | Save and restore agent state mid-stage | -| `scripts/sw-ci.sh` | 589 | GitHub Actions CI/CD Orchestration | -| `scripts/sw-cleanup.sh` | 350 | Clean up orphaned Claude team sessions & artifacts | -| `scripts/sw-code-review.sh` | 697 | Clean Code & Architecture Analysis | -| `scripts/sw-connect.sh` | 624 | Sync local state to team dashboard | -| `scripts/sw-context.sh` | 600 | Context Engine for Pipeline Stages | -| `scripts/sw-cost.sh` | 1013 | Token Usage & Cost Intelligence | -| `scripts/sw-daemon.sh` | 1416 | Autonomous GitHub Issue Watcher | -| `scripts/sw-dashboard.sh` | 510 | Fleet Command Dashboard | -| `scripts/sw-db.sh` | 1939 | SQLite Persistence Layer | -| `scripts/sw-decide.sh` | 691 | Shipwright Autonomous Decision Engine | -| `scripts/sw-decompose.sh` | 864 | Intelligent Issue Decomposition | -| `scripts/sw-deps.sh` | 533 | Automated Dependency Update Management | -| `scripts/sw-developer-simulation.sh` | 239 | Multi-Persona Developer Simulation | -| `scripts/sw-discovery.sh` | 910 | Cross-Pipeline Real-Time Learning | -| `scripts/sw-doc-fleet.sh` | 815 | Documentation Fleet Orchestrator | -| `scripts/sw-docs-agent.sh` | 525 | Auto-sync README, wiki, API docs | -| `scripts/sw-docs.sh` | 626 | Documentation Keeper | -| `scripts/sw-doctor.sh` | 1635 | Validate Shipwright setup | -| `scripts/sw-dora.sh` | 605 | DORA Metrics Dashboard with 
Engineering Intelligence | -| `scripts/sw-durable.sh` | 708 | Durable Workflow Engine | -| `scripts/sw-e2e-orchestrator.sh` | 535 | Test suite registry & execution | -| `scripts/sw-eventbus.sh` | 415 | Durable event bus for real-time inter-component | -| `scripts/sw-evidence.sh` | 1100 | Machine-Verifiable Proof for Agent Deliveries | -| `scripts/sw-feedback.sh` | 999 | Production Feedback Loop | -| `scripts/sw-fix.sh` | 474 | Bulk Fix Across Multiple Repos | -| `scripts/sw-fleet-discover.sh` | 550 | Auto-Discovery from GitHub Orgs | -| `scripts/sw-fleet-viz.sh` | 411 | Multi-Repo Fleet Visualization | -| `scripts/sw-fleet.sh` | 1377 | Multi-Repo Daemon Orchestrator | -| `scripts/sw-guild.sh` | 556 | Knowledge Guilds & Cross-Team Learning | -| `scripts/sw-heartbeat.sh` | 316 | File-based agent heartbeat protocol | -| `scripts/sw-hello.sh` | 67 | Hello World Command | -| `scripts/sw-hygiene.sh` | 660 | Repository Organization & Cleanup | -| `scripts/sw-incident.sh` | 1132 | Autonomous Incident Detection & Response | -| `scripts/sw-init.sh` | 869 | Complete setup for Shipwright + Shipwright | -| `scripts/sw-instrument.sh` | 691 | Pipeline Instrumentation & Feedback Loops | -| `scripts/sw-intelligence.sh` | 1547 | AI-Powered Analysis & Decision Engine | -| `scripts/sw-jira.sh` | 628 | Jira ↔ GitHub Bidirectional Sync | -| `scripts/sw-launchd.sh` | 703 | Process supervision (macOS + Linux) | -| `scripts/sw-linear.sh` | 643 | Linear ↔ GitHub Bidirectional Sync | -| `scripts/sw-logs.sh` | 353 | View and search agent pane logs | -| `scripts/sw-loop.sh` | 2527 | Continuous agent loop harness for Claude Code | -| `scripts/sw-memory.sh` | 2118 | Persistent Learning & Context System | -| `scripts/sw-mission-control.sh` | 473 | Terminal-based pipeline mission control | -| `scripts/sw-model-router.sh` | 1023 | Intelligent Model Routing & Cost Optimization | -| `scripts/sw-otel.sh` | 609 | OpenTelemetry Observability | -| `scripts/sw-oversight.sh` | 757 | Quality Oversight Board 
| -| `scripts/sw-patrol-meta.sh` | 776 | Shipwright Self-Improvement Patrol | -| `scripts/sw-pipeline-composer.sh` | 444 | Dynamic Pipeline Composition | -| `scripts/sw-pipeline-vitals.sh` | 1076 | Pipeline Vitals Engine | -| `scripts/sw-pipeline.sh` | 266 | Autonomous Feature Delivery (Idea → Production) | -| `scripts/sw-pm.sh` | 749 | Autonomous PM Agent for Team Orchestration | -| `scripts/sw-pr-lifecycle.sh` | 688 | Autonomous PR Management | -| `scripts/sw-predictive.sh` | 834 | Predictive & Proactive Intelligence | -| `scripts/sw-prep.sh` | 1675 | Repository Preparation for Agent Teams | -| `scripts/sw-ps.sh` | 156 | Show running agent process status | -| `scripts/sw-public-dashboard.sh` | 797 | Public real-time pipeline progress | -| `scripts/sw-quality.sh` | 676 | Intelligent completion, audits, zero auto | -| `scripts/sw-reaper.sh` | 384 | Automatic tmux pane cleanup when agents exit | -| `scripts/sw-recruit.sh` | 495 | AGI-Level Agent Recruitment & Talent Management | -| `scripts/sw-regression.sh` | 632 | Regression Detection Pipeline | -| `scripts/sw-release-manager.sh` | 721 | Autonomous Release Pipeline | -| `scripts/sw-release.sh` | 701 | Release train automation | -| `scripts/sw-remote.sh` | 670 | Machine Registry & Remote Daemon Management | -| `scripts/sw-replay.sh` | 542 | Pipeline run replay, timeline viewing, narratives | -| `scripts/sw-retro.sh` | 820 | Sprint Retrospective Engine | -| `scripts/sw-review-rerun.sh` | 222 | Canonical Rerun Comment Writer | -| `scripts/sw-scale.sh` | 609 | Dynamic agent team scaling during pipeline execution | -| `scripts/sw-security-audit.sh` | 510 | Comprehensive Security Auditing | -| `scripts/sw-self-optimize.sh` | 1690 | Learning & Self-Tuning System | -| `scripts/sw-session.sh` | 553 | Launch a Claude Code team session in a new tmux window | -| `scripts/sw-setup.sh` | 376 | Comprehensive onboarding wizard | -| `scripts/sw-stall-detector.sh` | 406 | Pipeline Stall & Deadlock Detection | -| 
`scripts/sw-standup.sh` | 721 | Automated Daily Standups for AI Agent Teams | -| `scripts/sw-status.sh` | 869 | Dashboard showing Claude Code team status | -| `scripts/sw-strategic.sh` | 943 | Strategic Intelligence Agent | -| `scripts/sw-stream.sh` | 451 | Live terminal output streaming from agent panes | -| `scripts/sw-swarm.sh` | 684 | Dynamic agent swarm management | -| `scripts/sw-team-stages.sh` | 500 | Multi-agent execution with leader/specialist roles | -| `scripts/sw-templates.sh` | 228 | Browse and inspect team templates | -| `scripts/sw-testgen.sh` | 567 | Autonomous test generation and coverage maintenance | -| `scripts/sw-tmux-pipeline.sh` | 538 | Spawn and manage pipelines in tmux windows | -| `scripts/sw-tmux-role-color.sh` | 81 | Set pane border color by agent role | -| `scripts/sw-tmux-status.sh` | 151 | Status bar widgets for tmux | -| `scripts/sw-tmux.sh` | 625 | tmux Health & Plugin Management | -| `scripts/sw-trace.sh` | 480 | E2E Traceability (Issue → Commit → PR → Deploy) | -| `scripts/sw-tracker.sh` | 517 | Provider Router for Issue Tracker Integration | -| `scripts/sw-triage.sh` | 812 | Intelligent Issue Labeling & Prioritization | -| `scripts/sw-upgrade.sh` | 477 | Detect and apply updates from the repo | -| `scripts/sw-ux.sh` | 685 | Premium UX Enhancement Layer | -| `scripts/sw-webhook.sh` | 621 | GitHub Webhook Receiver for Instant Issue Processing | -| `scripts/sw-widgets.sh` | 528 | Embeddable Status Widgets | -| `scripts/sw-worktree.sh` | 421 | Git worktree management for multi-agent isolation | -| `scripts/sw` | 620 | CLI router — dispatches subcommands via exec | +| File | Lines | Purpose | +| ------------------------------------- | ----: | ------------------------------------------------------ | +| `scripts/sw-activity.sh` | 480 | Live agent activity stream | +| `scripts/sw-adaptive.sh` | 941 | data-driven pipeline tuning | +| `scripts/sw-adversarial.sh` | 259 | Adversarial Agent Code Review | +| 
`scripts/sw-architecture-enforcer.sh` | 319 | Living Architecture Model & Enforcer | +| `scripts/sw-auth.sh` | 610 | GitHub OAuth Authentication | +| `scripts/sw-autonomous.sh` | 1057 | Master controller for AI-building-AI loop | +| `scripts/sw-changelog.sh` | 696 | Automated Release Notes & Migration Guides | +| `scripts/sw-checkpoint.sh` | 605 | Save and restore agent state mid-stage | +| `scripts/sw-ci.sh` | 589 | GitHub Actions CI/CD Orchestration | +| `scripts/sw-cleanup.sh` | 350 | Clean up orphaned Claude team sessions & artifacts | +| `scripts/sw-code-review.sh` | 697 | Clean Code & Architecture Analysis | +| `scripts/sw-connect.sh` | 624 | Sync local state to team dashboard | +| `scripts/sw-context.sh` | 600 | Context Engine for Pipeline Stages | +| `scripts/sw-cost.sh` | 1013 | Token Usage & Cost Intelligence | +| `scripts/sw-daemon.sh` | 1416 | Autonomous GitHub Issue Watcher | +| `scripts/sw-dashboard.sh` | 510 | Fleet Command Dashboard | +| `scripts/sw-db.sh` | 1939 | SQLite Persistence Layer | +| `scripts/sw-decide.sh` | 691 | Shipwright Autonomous Decision Engine | +| `scripts/sw-decompose.sh` | 864 | Intelligent Issue Decomposition | +| `scripts/sw-deps.sh` | 533 | Automated Dependency Update Management | +| `scripts/sw-developer-simulation.sh` | 239 | Multi-Persona Developer Simulation | +| `scripts/sw-discovery.sh` | 910 | Cross-Pipeline Real-Time Learning | +| `scripts/sw-doc-fleet.sh` | 815 | Documentation Fleet Orchestrator | +| `scripts/sw-docs-agent.sh` | 525 | Auto-sync README, wiki, API docs | +| `scripts/sw-docs.sh` | 626 | Documentation Keeper | +| `scripts/sw-doctor.sh` | 1635 | Validate Shipwright setup | +| `scripts/sw-dora.sh` | 605 | DORA Metrics Dashboard with Engineering Intelligence | +| `scripts/sw-durable.sh` | 708 | Durable Workflow Engine | +| `scripts/sw-e2e-orchestrator.sh` | 535 | Test suite registry & execution | +| `scripts/sw-eventbus.sh` | 415 | Durable event bus for real-time inter-component | +| 
`scripts/sw-evidence.sh` | 1100 | Machine-Verifiable Proof for Agent Deliveries | +| `scripts/sw-feedback.sh` | 999 | Production Feedback Loop | +| `scripts/sw-fix.sh` | 474 | Bulk Fix Across Multiple Repos | +| `scripts/sw-fleet-discover.sh` | 550 | Auto-Discovery from GitHub Orgs | +| `scripts/sw-fleet-viz.sh` | 411 | Multi-Repo Fleet Visualization | +| `scripts/sw-fleet.sh` | 1377 | Multi-Repo Daemon Orchestrator | +| `scripts/sw-guild.sh` | 556 | Knowledge Guilds & Cross-Team Learning | +| `scripts/sw-heartbeat.sh` | 316 | File-based agent heartbeat protocol | +| `scripts/sw-hello.sh` | 67 | Hello World Command | +| `scripts/sw-hygiene.sh` | 660 | Repository Organization & Cleanup | +| `scripts/sw-incident.sh` | 1132 | Autonomous Incident Detection & Response | +| `scripts/sw-init.sh` | 869 | Complete setup for Shipwright + Shipwright | +| `scripts/sw-instrument.sh` | 691 | Pipeline Instrumentation & Feedback Loops | +| `scripts/sw-intelligence.sh` | 1547 | AI-Powered Analysis & Decision Engine | +| `scripts/sw-jira.sh` | 628 | Jira ↔ GitHub Bidirectional Sync | +| `scripts/sw-launchd.sh` | 703 | Process supervision (macOS + Linux) | +| `scripts/sw-linear.sh` | 643 | Linear ↔ GitHub Bidirectional Sync | +| `scripts/sw-logs.sh` | 353 | View and search agent pane logs | +| `scripts/sw-loop.sh` | 2527 | Continuous agent loop harness for Claude Code | +| `scripts/sw-memory.sh` | 2118 | Persistent Learning & Context System | +| `scripts/sw-mission-control.sh` | 473 | Terminal-based pipeline mission control | +| `scripts/sw-model-router.sh` | 1023 | Intelligent Model Routing & Cost Optimization | +| `scripts/sw-otel.sh` | 609 | OpenTelemetry Observability | +| `scripts/sw-oversight.sh` | 757 | Quality Oversight Board | +| `scripts/sw-patrol-meta.sh` | 776 | Shipwright Self-Improvement Patrol | +| `scripts/sw-pipeline-composer.sh` | 444 | Dynamic Pipeline Composition | +| `scripts/sw-pipeline-vitals.sh` | 1076 | Pipeline Vitals Engine | +| `scripts/sw-pipeline.sh` | 
266 | Autonomous Feature Delivery (Idea → Production) | +| `scripts/sw-pm.sh` | 749 | Autonomous PM Agent for Team Orchestration | +| `scripts/sw-pr-lifecycle.sh` | 688 | Autonomous PR Management | +| `scripts/sw-predictive.sh` | 834 | Predictive & Proactive Intelligence | +| `scripts/sw-prep.sh` | 1675 | Repository Preparation for Agent Teams | +| `scripts/sw-ps.sh` | 156 | Show running agent process status | +| `scripts/sw-public-dashboard.sh` | 797 | Public real-time pipeline progress | +| `scripts/sw-quality.sh` | 676 | Intelligent completion, audits, zero auto | +| `scripts/sw-reaper.sh` | 384 | Automatic tmux pane cleanup when agents exit | +| `scripts/sw-recruit.sh` | 495 | AGI-Level Agent Recruitment & Talent Management | +| `scripts/sw-regression.sh` | 632 | Regression Detection Pipeline | +| `scripts/sw-release-manager.sh` | 721 | Autonomous Release Pipeline | +| `scripts/sw-release.sh` | 701 | Release train automation | +| `scripts/sw-remote.sh` | 670 | Machine Registry & Remote Daemon Management | +| `scripts/sw-replay.sh` | 542 | Pipeline run replay, timeline viewing, narratives | +| `scripts/sw-retro.sh` | 820 | Sprint Retrospective Engine | +| `scripts/sw-review-rerun.sh` | 222 | Canonical Rerun Comment Writer | +| `scripts/sw-scale.sh` | 609 | Dynamic agent team scaling during pipeline execution | +| `scripts/sw-security-audit.sh` | 510 | Comprehensive Security Auditing | +| `scripts/sw-self-optimize.sh` | 1690 | Learning & Self-Tuning System | +| `scripts/sw-session.sh` | 553 | Launch a Claude Code team session in a new tmux window | +| `scripts/sw-setup.sh` | 376 | Comprehensive onboarding wizard | +| `scripts/sw-stall-detector.sh` | 406 | Pipeline Stall & Deadlock Detection | +| `scripts/sw-standup.sh` | 721 | Automated Daily Standups for AI Agent Teams | +| `scripts/sw-status.sh` | 869 | Dashboard showing Claude Code team status | +| `scripts/sw-strategic.sh` | 943 | Strategic Intelligence Agent | +| `scripts/sw-stream.sh` | 451 | Live terminal 
output streaming from agent panes | +| `scripts/sw-swarm.sh` | 684 | Dynamic agent swarm management | +| `scripts/sw-team-stages.sh` | 500 | Multi-agent execution with leader/specialist roles | +| `scripts/sw-templates.sh` | 228 | Browse and inspect team templates | +| `scripts/sw-testgen.sh` | 567 | Autonomous test generation and coverage maintenance | +| `scripts/sw-tmux-pipeline.sh` | 538 | Spawn and manage pipelines in tmux windows | +| `scripts/sw-tmux-role-color.sh` | 81 | Set pane border color by agent role | +| `scripts/sw-tmux-status.sh` | 151 | Status bar widgets for tmux | +| `scripts/sw-tmux.sh` | 625 | tmux Health & Plugin Management | +| `scripts/sw-trace.sh` | 480 | E2E Traceability (Issue → Commit → PR → Deploy) | +| `scripts/sw-tracker.sh` | 517 | Provider Router for Issue Tracker Integration | +| `scripts/sw-triage.sh` | 812 | Intelligent Issue Labeling & Prioritization | +| `scripts/sw-upgrade.sh` | 477 | Detect and apply updates from the repo | +| `scripts/sw-ux.sh` | 685 | Premium UX Enhancement Layer | +| `scripts/sw-webhook.sh` | 621 | GitHub Webhook Receiver for Instant Issue Processing | +| `scripts/sw-widgets.sh` | 528 | Embeddable Status Widgets | +| `scripts/sw-worktree.sh` | 421 | Git worktree management for multi-agent isolation | +| `scripts/sw` | 620 | CLI router — dispatches subcommands via exec | + ### GitHub API Modules -| File | Lines | Purpose | -| --- | ---: | --- | -| `scripts/sw-github-app.sh` | 592 | GitHub App Management & Webhook Receiver | -| `scripts/sw-github-checks.sh` | 501 | Native GitHub Checks API Integration | -| `scripts/sw-github-deploy.sh` | 513 | Native GitHub Deployments API Integration | -| `scripts/sw-github-graphql.sh` | 965 | GitHub GraphQL API Client | +| File | Lines | Purpose | +| ------------------------------ | ----: | ----------------------------------------- | +| `scripts/sw-github-app.sh` | 592 | GitHub App Management & Webhook Receiver | +| `scripts/sw-github-checks.sh` | 501 | Native GitHub 
Checks API Integration | +| `scripts/sw-github-deploy.sh` | 513 | Native GitHub Deployments API Integration | +| `scripts/sw-github-graphql.sh` | 965 | GitHub GraphQL API Client | + ### Issue Tracker Adapters -| File | Lines | Purpose | -| --- | ---: | --- | -| `scripts/sw-linear.sh` | 643 | Linear ↔ GitHub Bidirectional Sync | -| `scripts/sw-jira.sh` | 628 | Jira ↔ GitHub Bidirectional Sync | -| `scripts/sw-tracker-linear.sh` | 568 | do not call directly | -| `scripts/sw-tracker-jira.sh` | 474 | do not call directly | +| File | Lines | Purpose | +| ------------------------------ | ----: | ---------------------------------- | +| `scripts/sw-linear.sh` | 643 | Linear ↔ GitHub Bidirectional Sync | +| `scripts/sw-jira.sh` | 628 | Jira ↔ GitHub Bidirectional Sync | +| `scripts/sw-tracker-linear.sh` | 568 | do not call directly | +| `scripts/sw-tracker-jira.sh` | 474 | do not call directly | + ### Shared Libraries @@ -647,151 +651,152 @@ All scripts are bash (except the dashboard server in TypeScript). 
Grouped by lay -| File | Lines | Purpose | -| --- | ---: | --- | -| `scripts/sw-activity-test.sh` | 219 | Validate live agent activity stream | -| `scripts/sw-adapters-test.sh` | 197 | Structural/smoke tests for terminal & deploy | -| `scripts/sw-adaptive-model-test.sh` | 399 | Test Suite for Adaptive Model Selection | -| `scripts/sw-adaptive-test.sh` | 206 | Validate data-driven pipeline tuning | -| `scripts/sw-adaptive-timeout-test.sh` | 406 | Test Suite for Adaptive Stage Timeout Engine | -| `scripts/sw-adversarial-test.sh` | 258 | Validate adversarial agent code review | -| `scripts/sw-agi-roadmap-test.sh` | 857 | Tests every feature we implemented | -| `scripts/sw-architecture-enforcer-test.sh` | 301 | Validate architecture model | -| `scripts/sw-auth-test.sh` | 141 | Validate OAuth authentication commands | -| `scripts/sw-autonomous-e2e-test.sh` | 292 | Autonomous Loop E2E Test | -| `scripts/sw-autonomous-test.sh` | 207 | AI-building-AI master controller tests | -| `scripts/sw-budget-chaos-test.sh` | 251 | Budget Exhaustion & Chaos Tests | -| `scripts/sw-changelog-test.sh` | 201 | Validate release notes generation | -| `scripts/sw-chaos-test.sh` | 384 | Fault injection & recovery validation | -| `scripts/sw-checkpoint-test.sh` | 341 | Validate checkpoint save/restore | -| `scripts/sw-ci-test.sh` | 198 | GitHub Actions CI/CD orchestration tests | -| `scripts/sw-cleanup-test.sh` | 168 | Clean up orphaned sessions & artifacts | -| `scripts/sw-code-review-test.sh` | 173 | Clean code & architecture analysis tests | -| `scripts/sw-connect-test.sh` | 822 | Validate dashboard connection, heartbeat | -| `scripts/sw-context-budget-test.sh` | 335 | Context Window Budget Monitor tests | -| `scripts/sw-context-test.sh` | 219 | Context Engine for Pipeline Stages tests | -| `scripts/sw-convergence-test.sh` | 324 | Unit tests for convergence detection | -| `scripts/sw-cost-optimizer-test.sh` | 466 | Test suite for cost optimization | -| `scripts/sw-cost-test.sh` | 234 | 
Validate token usage & cost intelligence | -| `scripts/sw-daemon-test.sh` | 1985 | Unit tests for daemon metrics, health, alerting | -| `scripts/sw-dashboard-e2e-test.sh` | 591 | full live validation | -| `scripts/sw-dashboard-test.sh` | 250 | validates dashboard structure | -| `scripts/sw-db-test.sh` | 971 | SQLite Persistence Layer Test Suite | -| `scripts/sw-decide-test.sh` | 519 | Unit tests for the Autonomous Decision Engine | -| `scripts/sw-decompose-test.sh` | 220 | Intelligent Issue Decomposition tests | -| `scripts/sw-deps-test.sh` | 165 | Automated Dependency Update Management tests | -| `scripts/sw-developer-simulation-test.sh` | 262 | Validate multi-persona | -| `scripts/sw-discovery-test.sh` | 268 | Cross-Pipeline Real-Time Learning tests | -| `scripts/sw-doc-fleet-test.sh` | 344 | Validate documentation fleet operations | -| `scripts/sw-docs-agent-test.sh` | 182 | Validate documentation agent operations | -| `scripts/sw-docs-test.sh` | 781 | Validate documentation keeper, AUTO sections, | -| `scripts/sw-doctor-test.sh` | 420 | Validate setup diagnostics | -| `scripts/sw-dora-test.sh` | 241 | Validate DORA metrics dashboard, DX metrics, | -| `scripts/sw-durable-test.sh` | 221 | Validate durable workflow engine | -| `scripts/sw-e2e-integration-test.sh` | 352 | Real Claude + Real GitHub | -| `scripts/sw-e2e-orchestrator-test.sh` | 157 | Test suite registry & execution | -| `scripts/sw-e2e-smoke-test.sh` | 835 | Pipeline orchestration without API keys | -| `scripts/sw-e2e-system-test.sh` | 465 | Proves full daemon→pipeline→loop→PR flow | -| `scripts/sw-eventbus-test.sh` | 155 | Durable event bus tests | -| `scripts/sw-evidence-test.sh` | 416 | Unit tests for sw-evidence.sh | -| `scripts/sw-feedback-test.sh` | 302 | Production Feedback Loop tests | -| `scripts/sw-fix-test.sh` | 619 | Unit tests for bulk fix across repos | -| `scripts/sw-fleet-discover-test.sh` | 274 | Validate GitHub org auto-discovery, | -| `scripts/sw-fleet-test.sh` | 822 | Unit tests 
for fleet orchestration | -| `scripts/sw-fleet-viz-test.sh` | 278 | Validate fleet visualization dashboard, | -| `scripts/sw-frontier-test.sh` | 574 | Validate adversarial review, developer | -| `scripts/sw-github-app-test.sh` | 145 | Validate GitHub App management | -| `scripts/sw-github-checks-test.sh` | 535 | Validate Checks API wrapper | -| `scripts/sw-github-deploy-test.sh` | 523 | Validate Deployments API wrapper | -| `scripts/sw-github-graphql-test.sh` | 661 | Unit tests for GitHub GraphQL client | -| `scripts/sw-guild-test.sh` | 149 | Knowledge guilds & cross-team learning tests | -| `scripts/sw-heartbeat-test.sh` | 581 | Validate heartbeat lifecycle, | -| `scripts/sw-hello-test.sh` | 108 | Hello Command Test Suite | -| `scripts/sw-hygiene-test.sh` | 198 | Repository Organization & Cleanup tests | -| `scripts/sw-incident-test.sh` | 434 | Validate incident detection & response | -| `scripts/sw-init-test.sh` | 645 | E2E validation of init/setup flow | -| `scripts/sw-instrument-test.sh` | 172 | Pipeline instrumentation & feedback loops | -| `scripts/sw-integration-claude-test.sh` | 63 | Budget-limited real Claude smoke | -| `scripts/sw-intelligence-test.sh` | 534 | Unit tests for intelligence core | -| `scripts/sw-jira-test.sh` | 284 | Validate Jira ↔ GitHub bidirectional sync | -| `scripts/sw-launchd-test.sh` | 899 | Validate service management on | -| `scripts/sw-lib-audit-trail-test.sh` | 311 | | -| `scripts/sw-lib-compat-test.sh` | 297 | Unit tests for cross-platform helpers | -| `scripts/sw-lib-compound-audit-test.sh` | 281 | | -| `scripts/sw-lib-daemon-dispatch-test.sh` | 415 | Unit tests for spawn/reap/queue | -| `scripts/sw-lib-daemon-failure-test.sh` | 213 | Unit tests for failure handling | -| `scripts/sw-lib-daemon-patrol-test.sh` | 343 | Unit tests for all patrol functions | -| `scripts/sw-lib-daemon-poll-test.sh` | 344 | Unit tests for poll, health, cleanup | -| `scripts/sw-lib-daemon-state-test.sh` | 383 | Unit tests for state management | -| 
`scripts/sw-lib-daemon-triage-test.sh` | 267 | Unit tests for triage scoring | -| `scripts/sw-lib-error-actionability-test.sh` | 213 | | -| `scripts/sw-lib-helpers-test.sh` | 229 | Unit tests for shared helper functions | -| `scripts/sw-lib-pipeline-detection-test.sh` | 391 | Unit tests for detection fns | -| `scripts/sw-lib-pipeline-intelligence-test.sh` | 410 | Unit tests for intelligence | -| `scripts/sw-lib-pipeline-quality-checks-test.sh` | 193 | Unit tests for quality | -| `scripts/sw-lib-pipeline-stages-test.sh` | 290 | Unit tests for stage functions | -| `scripts/sw-lib-pipeline-state-test.sh` | 309 | Unit tests for pipeline state | -| `scripts/sw-linear-test.sh` | 300 | Validate Linear ↔ GitHub bidirectional sync | -| `scripts/sw-logs-test.sh` | 281 | Validate agent pane log viewing, searching, | -| `scripts/sw-loop-test.sh` | 816 | Validate continuous agent loop harness | -| `scripts/sw-memory-discovery-e2e-test.sh` | 411 | Memory & Discovery E2E Test | -| `scripts/sw-memory-effectiveness-test.sh` | 495 | Unit tests | -| `scripts/sw-memory-test.sh` | 871 | Unit tests for memory system & cost tracking | -| `scripts/sw-mission-control-test.sh` | 153 | Validate mission control dashboard | -| `scripts/sw-model-router-test.sh` | 303 | Intelligent model routing & optimization | -| `scripts/sw-otel-test.sh` | 146 | OpenTelemetry observability | -| `scripts/sw-oversight-test.sh` | 164 | Quality oversight board tests | -| `scripts/sw-patrol-meta-test.sh` | 449 | Validate self-improvement patrol | -| `scripts/sw-pipeline-composer-test.sh` | 632 | Test Suite | -| `scripts/sw-pipeline-test.sh` | 1908 | E2E validation invoking the REAL pipeline | -| `scripts/sw-pipeline-vitals-test.sh` | 226 | Validate pipeline health scoring | -| `scripts/sw-pm-test.sh` | 225 | Autonomous PM Agent test suite | -| `scripts/sw-policy-e2e-test.sh` | 290 | Verify config/policy.json is honored | -| `scripts/sw-pr-lifecycle-test.sh` | 317 | Validate autonomous PR management | -| 
`scripts/sw-predictive-test.sh` | 691 | Unit tests for predictive intelligence | -| `scripts/sw-prep-test.sh` | 636 | Validate repo preparation | -| `scripts/sw-project-detect-test.sh` | 434 | Unit tests for project detection | -| `scripts/sw-ps-test.sh` | 296 | Validate agent process status display | -| `scripts/sw-public-dashboard-test.sh` | 165 | Validate public dashboard generation | -| `scripts/sw-quality-test.sh` | 227 | Validate ruthless quality validation engine | -| `scripts/sw-reaper-test.sh` | 232 | Validate automatic tmux pane cleanup | -| `scripts/sw-recruit-test.sh` | 1395 | Test suite for AGI-level agent recruitment system | -| `scripts/sw-regression-test.sh` | 258 | Validate regression detection pipeline | -| `scripts/sw-release-manager-test.sh` | 206 | Validate release pipeline | -| `scripts/sw-release-test.sh` | 200 | Release train automation | -| `scripts/sw-remote-test.sh` | 396 | Validate machine registry, atomic writes, | -| `scripts/sw-replay-test.sh` | 167 | Pipeline run replay & timeline viewing | -| `scripts/sw-retro-test.sh` | 171 | Sprint retrospective engine tests | -| `scripts/sw-review-rerun-test.sh` | 317 | SHA-deduped rerun comment writer | -| `scripts/sw-root-cause-test.sh` | 374 | | -| `scripts/sw-scale-test.sh` | 151 | Dynamic agent team scaling | -| `scripts/sw-security-audit-test.sh` | 162 | Security auditing tests | -| `scripts/sw-self-optimize-test.sh` | 837 | Unit tests for learning & tuning system | -| `scripts/sw-server-api-test.sh` | 713 | Dashboard Server API Test Suite | -| `scripts/sw-session-restart-test.sh` | 520 | Intelligent restart briefing system | -| `scripts/sw-session-test.sh` | 586 | E2E validation of session creation flow | -| `scripts/sw-setup-test.sh` | 262 | Validate comprehensive onboarding wizard | -| `scripts/sw-stall-detector-test.sh` | 367 | Validate stall detection and abort | -| `scripts/sw-standup-test.sh` | 241 | Validate daily standup automation | -| `scripts/sw-status-test.sh` | 294 | Validate 
status dashboard and --json output | -| `scripts/sw-strategic-test.sh` | 214 | Validate strategic intelligence agent | -| `scripts/sw-stream-test.sh` | 140 | Live terminal output streaming | -| `scripts/sw-swarm-test.sh` | 153 | Dynamic agent swarm management tests | -| `scripts/sw-team-stages-test.sh` | 148 | Validate multi-agent stage execution | -| `scripts/sw-templates-test.sh` | 251 | Validate team template browser | -| `scripts/sw-test-optimizer-test.sh` | 395 | Test suite for test execution optimizer | -| `scripts/sw-testgen-test.sh` | 160 | Test generation & coverage tests | -| `scripts/sw-tmux-pipeline-test.sh` | 187 | Validate tmux pipeline management | -| `scripts/sw-tmux-test.sh` | 746 | Validate tmux doctor, install, fix, reload, | -| `scripts/sw-trace-test.sh` | 143 | E2E traceability (Issue → Commit → PR → Deploy) | -| `scripts/sw-tracker-providers-test.sh` | 552 | Unit tests for GitHub, Linear, | -| `scripts/sw-tracker-test.sh` | 534 | Validate tracker router, providers, and | -| `scripts/sw-triage-test.sh` | 242 | Intelligent Issue Labeling & Prioritization | -| `scripts/sw-upgrade-test.sh` | 334 | Validate upgrade detection and apply | -| `scripts/sw-ux-test.sh` | 185 | Validate UX enhancement layer | -| `scripts/sw-webhook-test.sh` | 167 | GitHub Webhook Receiver tests | -| `scripts/sw-widgets-test.sh` | 357 | Validate embeddable status widgets | -| `scripts/sw-worktree-test.sh` | 148 | Git worktree management for agent isolation | +| File | Lines | Purpose | +| ------------------------------------------------ | ----: | ------------------------------------------------- | +| `scripts/sw-activity-test.sh` | 219 | Validate live agent activity stream | +| `scripts/sw-adapters-test.sh` | 197 | Structural/smoke tests for terminal & deploy | +| `scripts/sw-adaptive-model-test.sh` | 399 | Test Suite for Adaptive Model Selection | +| `scripts/sw-adaptive-test.sh` | 206 | Validate data-driven pipeline tuning | +| `scripts/sw-adaptive-timeout-test.sh` | 406 
| Test Suite for Adaptive Stage Timeout Engine | +| `scripts/sw-adversarial-test.sh` | 258 | Validate adversarial agent code review | +| `scripts/sw-agi-roadmap-test.sh` | 857 | Tests every feature we implemented | +| `scripts/sw-architecture-enforcer-test.sh` | 301 | Validate architecture model | +| `scripts/sw-auth-test.sh` | 141 | Validate OAuth authentication commands | +| `scripts/sw-autonomous-e2e-test.sh` | 292 | Autonomous Loop E2E Test | +| `scripts/sw-autonomous-test.sh` | 207 | AI-building-AI master controller tests | +| `scripts/sw-budget-chaos-test.sh` | 251 | Budget Exhaustion & Chaos Tests | +| `scripts/sw-changelog-test.sh` | 201 | Validate release notes generation | +| `scripts/sw-chaos-test.sh` | 384 | Fault injection & recovery validation | +| `scripts/sw-checkpoint-test.sh` | 341 | Validate checkpoint save/restore | +| `scripts/sw-ci-test.sh` | 198 | GitHub Actions CI/CD orchestration tests | +| `scripts/sw-cleanup-test.sh` | 168 | Clean up orphaned sessions & artifacts | +| `scripts/sw-code-review-test.sh` | 173 | Clean code & architecture analysis tests | +| `scripts/sw-connect-test.sh` | 822 | Validate dashboard connection, heartbeat | +| `scripts/sw-context-budget-test.sh` | 335 | Context Window Budget Monitor tests | +| `scripts/sw-context-test.sh` | 219 | Context Engine for Pipeline Stages tests | +| `scripts/sw-convergence-test.sh` | 324 | Unit tests for convergence detection | +| `scripts/sw-cost-optimizer-test.sh` | 466 | Test suite for cost optimization | +| `scripts/sw-cost-test.sh` | 234 | Validate token usage & cost intelligence | +| `scripts/sw-daemon-test.sh` | 1985 | Unit tests for daemon metrics, health, alerting | +| `scripts/sw-dashboard-e2e-test.sh` | 591 | full live validation | +| `scripts/sw-dashboard-test.sh` | 250 | validates dashboard structure | +| `scripts/sw-db-test.sh` | 971 | SQLite Persistence Layer Test Suite | +| `scripts/sw-decide-test.sh` | 519 | Unit tests for the Autonomous Decision Engine | +| 
`scripts/sw-decompose-test.sh` | 220 | Intelligent Issue Decomposition tests | +| `scripts/sw-deps-test.sh` | 165 | Automated Dependency Update Management tests | +| `scripts/sw-developer-simulation-test.sh` | 262 | Validate multi-persona | +| `scripts/sw-discovery-test.sh` | 268 | Cross-Pipeline Real-Time Learning tests | +| `scripts/sw-doc-fleet-test.sh` | 344 | Validate documentation fleet operations | +| `scripts/sw-docs-agent-test.sh` | 182 | Validate documentation agent operations | +| `scripts/sw-docs-test.sh` | 781 | Validate documentation keeper, AUTO sections, | +| `scripts/sw-doctor-test.sh` | 420 | Validate setup diagnostics | +| `scripts/sw-dora-test.sh` | 241 | Validate DORA metrics dashboard, DX metrics, | +| `scripts/sw-durable-test.sh` | 221 | Validate durable workflow engine | +| `scripts/sw-e2e-integration-test.sh` | 352 | Real Claude + Real GitHub | +| `scripts/sw-e2e-orchestrator-test.sh` | 157 | Test suite registry & execution | +| `scripts/sw-e2e-smoke-test.sh` | 835 | Pipeline orchestration without API keys | +| `scripts/sw-e2e-system-test.sh` | 465 | Proves full daemon→pipeline→loop→PR flow | +| `scripts/sw-eventbus-test.sh` | 155 | Durable event bus tests | +| `scripts/sw-evidence-test.sh` | 416 | Unit tests for sw-evidence.sh | +| `scripts/sw-feedback-test.sh` | 302 | Production Feedback Loop tests | +| `scripts/sw-fix-test.sh` | 619 | Unit tests for bulk fix across repos | +| `scripts/sw-fleet-discover-test.sh` | 274 | Validate GitHub org auto-discovery, | +| `scripts/sw-fleet-test.sh` | 822 | Unit tests for fleet orchestration | +| `scripts/sw-fleet-viz-test.sh` | 278 | Validate fleet visualization dashboard, | +| `scripts/sw-frontier-test.sh` | 574 | Validate adversarial review, developer | +| `scripts/sw-github-app-test.sh` | 145 | Validate GitHub App management | +| `scripts/sw-github-checks-test.sh` | 535 | Validate Checks API wrapper | +| `scripts/sw-github-deploy-test.sh` | 523 | Validate Deployments API wrapper | +| 
`scripts/sw-github-graphql-test.sh` | 661 | Unit tests for GitHub GraphQL client | +| `scripts/sw-guild-test.sh` | 149 | Knowledge guilds & cross-team learning tests | +| `scripts/sw-heartbeat-test.sh` | 581 | Validate heartbeat lifecycle, | +| `scripts/sw-hello-test.sh` | 108 | Hello Command Test Suite | +| `scripts/sw-hygiene-test.sh` | 198 | Repository Organization & Cleanup tests | +| `scripts/sw-incident-test.sh` | 434 | Validate incident detection & response | +| `scripts/sw-init-test.sh` | 645 | E2E validation of init/setup flow | +| `scripts/sw-instrument-test.sh` | 172 | Pipeline instrumentation & feedback loops | +| `scripts/sw-integration-claude-test.sh` | 63 | Budget-limited real Claude smoke | +| `scripts/sw-intelligence-test.sh` | 534 | Unit tests for intelligence core | +| `scripts/sw-jira-test.sh` | 284 | Validate Jira ↔ GitHub bidirectional sync | +| `scripts/sw-launchd-test.sh` | 899 | Validate service management on | +| `scripts/sw-lib-audit-trail-test.sh` | 311 | | +| `scripts/sw-lib-compat-test.sh` | 297 | Unit tests for cross-platform helpers | +| `scripts/sw-lib-compound-audit-test.sh` | 281 | | +| `scripts/sw-lib-daemon-dispatch-test.sh` | 415 | Unit tests for spawn/reap/queue | +| `scripts/sw-lib-daemon-failure-test.sh` | 213 | Unit tests for failure handling | +| `scripts/sw-lib-daemon-patrol-test.sh` | 343 | Unit tests for all patrol functions | +| `scripts/sw-lib-daemon-poll-test.sh` | 344 | Unit tests for poll, health, cleanup | +| `scripts/sw-lib-daemon-state-test.sh` | 383 | Unit tests for state management | +| `scripts/sw-lib-daemon-triage-test.sh` | 267 | Unit tests for triage scoring | +| `scripts/sw-lib-error-actionability-test.sh` | 213 | | +| `scripts/sw-lib-helpers-test.sh` | 229 | Unit tests for shared helper functions | +| `scripts/sw-lib-pipeline-detection-test.sh` | 391 | Unit tests for detection fns | +| `scripts/sw-lib-pipeline-intelligence-test.sh` | 410 | Unit tests for intelligence | +| 
`scripts/sw-lib-pipeline-quality-checks-test.sh` | 193 | Unit tests for quality | +| `scripts/sw-lib-pipeline-stages-test.sh` | 290 | Unit tests for stage functions | +| `scripts/sw-lib-pipeline-state-test.sh` | 309 | Unit tests for pipeline state | +| `scripts/sw-linear-test.sh` | 300 | Validate Linear ↔ GitHub bidirectional sync | +| `scripts/sw-logs-test.sh` | 281 | Validate agent pane log viewing, searching, | +| `scripts/sw-loop-test.sh` | 816 | Validate continuous agent loop harness | +| `scripts/sw-memory-discovery-e2e-test.sh` | 411 | Memory & Discovery E2E Test | +| `scripts/sw-memory-effectiveness-test.sh` | 495 | Unit tests | +| `scripts/sw-memory-test.sh` | 871 | Unit tests for memory system & cost tracking | +| `scripts/sw-mission-control-test.sh` | 153 | Validate mission control dashboard | +| `scripts/sw-model-router-test.sh` | 303 | Intelligent model routing & optimization | +| `scripts/sw-otel-test.sh` | 146 | OpenTelemetry observability | +| `scripts/sw-oversight-test.sh` | 164 | Quality oversight board tests | +| `scripts/sw-patrol-meta-test.sh` | 449 | Validate self-improvement patrol | +| `scripts/sw-pipeline-composer-test.sh` | 632 | Test Suite | +| `scripts/sw-pipeline-test.sh` | 1908 | E2E validation invoking the REAL pipeline | +| `scripts/sw-pipeline-vitals-test.sh` | 226 | Validate pipeline health scoring | +| `scripts/sw-pm-test.sh` | 225 | Autonomous PM Agent test suite | +| `scripts/sw-policy-e2e-test.sh` | 290 | Verify config/policy.json is honored | +| `scripts/sw-pr-lifecycle-test.sh` | 317 | Validate autonomous PR management | +| `scripts/sw-predictive-test.sh` | 691 | Unit tests for predictive intelligence | +| `scripts/sw-prep-test.sh` | 636 | Validate repo preparation | +| `scripts/sw-project-detect-test.sh` | 434 | Unit tests for project detection | +| `scripts/sw-ps-test.sh` | 296 | Validate agent process status display | +| `scripts/sw-public-dashboard-test.sh` | 165 | Validate public dashboard generation | +| 
`scripts/sw-quality-test.sh` | 227 | Validate ruthless quality validation engine | +| `scripts/sw-reaper-test.sh` | 232 | Validate automatic tmux pane cleanup | +| `scripts/sw-recruit-test.sh` | 1395 | Test suite for AGI-level agent recruitment system | +| `scripts/sw-regression-test.sh` | 258 | Validate regression detection pipeline | +| `scripts/sw-release-manager-test.sh` | 206 | Validate release pipeline | +| `scripts/sw-release-test.sh` | 200 | Release train automation | +| `scripts/sw-remote-test.sh` | 396 | Validate machine registry, atomic writes, | +| `scripts/sw-replay-test.sh` | 167 | Pipeline run replay & timeline viewing | +| `scripts/sw-retro-test.sh` | 171 | Sprint retrospective engine tests | +| `scripts/sw-review-rerun-test.sh` | 317 | SHA-deduped rerun comment writer | +| `scripts/sw-root-cause-test.sh` | 374 | | +| `scripts/sw-scale-test.sh` | 151 | Dynamic agent team scaling | +| `scripts/sw-security-audit-test.sh` | 162 | Security auditing tests | +| `scripts/sw-self-optimize-test.sh` | 837 | Unit tests for learning & tuning system | +| `scripts/sw-server-api-test.sh` | 713 | Dashboard Server API Test Suite | +| `scripts/sw-session-restart-test.sh` | 520 | Intelligent restart briefing system | +| `scripts/sw-session-test.sh` | 586 | E2E validation of session creation flow | +| `scripts/sw-setup-test.sh` | 262 | Validate comprehensive onboarding wizard | +| `scripts/sw-stall-detector-test.sh` | 367 | Validate stall detection and abort | +| `scripts/sw-standup-test.sh` | 241 | Validate daily standup automation | +| `scripts/sw-status-test.sh` | 294 | Validate status dashboard and --json output | +| `scripts/sw-strategic-test.sh` | 214 | Validate strategic intelligence agent | +| `scripts/sw-stream-test.sh` | 140 | Live terminal output streaming | +| `scripts/sw-swarm-test.sh` | 153 | Dynamic agent swarm management tests | +| `scripts/sw-team-stages-test.sh` | 148 | Validate multi-agent stage execution | +| `scripts/sw-templates-test.sh` | 251 | 
Validate team template browser | +| `scripts/sw-test-optimizer-test.sh` | 395 | Test suite for test execution optimizer | +| `scripts/sw-testgen-test.sh` | 160 | Test generation & coverage tests | +| `scripts/sw-tmux-pipeline-test.sh` | 187 | Validate tmux pipeline management | +| `scripts/sw-tmux-test.sh` | 746 | Validate tmux doctor, install, fix, reload, | +| `scripts/sw-trace-test.sh` | 143 | E2E traceability (Issue → Commit → PR → Deploy) | +| `scripts/sw-tracker-providers-test.sh` | 552 | Unit tests for GitHub, Linear, | +| `scripts/sw-tracker-test.sh` | 534 | Validate tracker router, providers, and | +| `scripts/sw-triage-test.sh` | 242 | Intelligent Issue Labeling & Prioritization | +| `scripts/sw-upgrade-test.sh` | 334 | Validate upgrade detection and apply | +| `scripts/sw-ux-test.sh` | 185 | Validate UX enhancement layer | +| `scripts/sw-webhook-test.sh` | 167 | GitHub Webhook Receiver tests | +| `scripts/sw-widgets-test.sh` | 357 | Validate embeddable status widgets | +| `scripts/sw-worktree-test.sh` | 148 | Git worktree management for agent isolation | + ### Dashboard & Infra @@ -862,14 +867,15 @@ Intelligence defaults to **auto** (enabled when Claude CLI is available). 
Config -| Flag | Default | Purpose | -| --- | --- | --- | -| `intelligence.cache_ttl_seconds` | `3600` | | -| `intelligence.adversarial_enabled` | `false` | | -| `intelligence.simulation_enabled` | `false` | | -| `intelligence.architecture_enabled` | `false` | | -| `intelligence.ab_test_ratio` | `0.2` | | -| `intelligence.anomaly_threshold` | `3.0` | | +| Flag | Default | Purpose | +| ----------------------------------- | ------- | ------- | +| `intelligence.cache_ttl_seconds` | `3600` | | +| `intelligence.adversarial_enabled` | `false` | | +| `intelligence.simulation_enabled` | `false` | | +| `intelligence.architecture_enabled` | `false` | | +| `intelligence.ab_test_ratio` | `0.2` | | +| `intelligence.anomaly_threshold` | `3.0` | | + ### Modules diff --git a/scripts/lib/daemon-triage.sh b/scripts/lib/daemon-triage.sh index 0dba1b4f4..04d7546fe 100755 --- a/scripts/lib/daemon-triage.sh +++ b/scripts/lib/daemon-triage.sh @@ -437,7 +437,9 @@ select_pipeline_template() { fi # ── Score-based selection ── - if [[ "$score" -ge 70 ]]; then + if [[ "$score" -ge 90 ]]; then + echo "minimal" + elif [[ "$score" -ge 70 ]]; then echo "fast" elif [[ "$score" -ge 40 ]]; then echo "standard" diff --git a/scripts/lib/project-detect.sh b/scripts/lib/project-detect.sh index 572d1c757..7d9b46b0d 100644 --- a/scripts/lib/project-detect.sh +++ b/scripts/lib/project-detect.sh @@ -513,7 +513,13 @@ project_recommend_template() { -not -path "*/target/*" 2>/dev/null -exec cat {} + 2>/dev/null | wc -l | tr -d ' ') # Heuristics for template recommendation - if [[ "$src_file_count" -lt 20 && "$test_file_count" -lt 5 && "$has_deploy" == "false" ]]; then + if [[ "$src_file_count" -le 5 && "$has_deploy" == "false" && "$has_ci" == "false" ]]; then + # Trivial project — single-file or handful of files + template="minimal" + confidence=90 + reason="Trivial project with very few source files" + + elif [[ "$src_file_count" -lt 20 && "$test_file_count" -lt 5 && "$has_deploy" == "false" ]]; then # Small 
project, no tests, no deploy template="fast" confidence=85 diff --git a/scripts/sw-lib-daemon-triage-test.sh b/scripts/sw-lib-daemon-triage-test.sh index 9ae86dfab..ebfd853ef 100755 --- a/scripts/sw-lib-daemon-triage-test.sh +++ b/scripts/sw-lib-daemon-triage-test.sh @@ -203,6 +203,10 @@ result=$(printf '%s' "$result" | tr -cd '[:alnum:]-') assert_eq "Incident label → hotfix template" "hotfix" "$result" # Score-based (no special labels) +result=$(select_pipeline_template "enhancement" 95 2>/dev/null | tail -1) +result=$(printf '%s' "$result" | tr -cd '[:alnum:]-') +assert_eq "Very high score → minimal template" "minimal" "$result" + result=$(select_pipeline_template "enhancement" 75 2>/dev/null | tail -1) result=$(printf '%s' "$result" | tr -cd '[:alnum:]-') assert_eq "High score → fast template" "fast" "$result" diff --git a/scripts/sw-project-detect-test.sh b/scripts/sw-project-detect-test.sh index 900c6bdb6..cd70541aa 100755 --- a/scripts/sw-project-detect-test.sh +++ b/scripts/sw-project-detect-test.sh @@ -277,13 +277,27 @@ assert_eq "Rust build command" "cargo build" "$result" # Test: project_recommend_template # ═══════════════════════════════════════════════════════════════════════════════ -print_test_section "project_recommend_template — Small project" +print_test_section "project_recommend_template — Trivial project (minimal)" + +test_proj="$TEST_TEMP_DIR/trivial-project" +create_nodejs_project "$test_proj" +# Trivial project (0 source files, no CI, no deploy) should get "minimal" +result=$(project_recommend_template "$test_proj") || true +assert_json_key "Trivial project recommends minimal" "$result" ".template" "minimal" + +# ─── Test: project_recommend_template — Small project (fast) ───────────────── +print_test_section "project_recommend_template — Small project (fast)" test_proj="$TEST_TEMP_DIR/small-project" create_nodejs_project "$test_proj" +# Add enough source files to exceed the minimal threshold (>5 files) +mkdir -p "$test_proj/src" +for i in 1 
2 3 4 5 6 7 8; do + echo "module.exports = {};" > "$test_proj/src/file${i}.js" +done result=$(project_recommend_template "$test_proj") || true -# Small project should get "fast" template +# Small project with >5 source files should get "fast" template assert_json_key "Small project recommends fast" "$result" ".template" "fast" # ─── Test: project_recommend_template with Docker ──────────────────────────── diff --git a/scripts/sw-upgrade.sh b/scripts/sw-upgrade.sh index cfa50e116..82290ca46 100755 --- a/scripts/sw-upgrade.sh +++ b/scripts/sw-upgrade.sh @@ -186,6 +186,7 @@ FILES=( "devops.json|tmux/templates/devops.json|$HOME/.shipwright/templates/devops.json|false|false" "architecture.json|tmux/templates/architecture.json|$HOME/.shipwright/templates/architecture.json|false|false" "definition-of-done.example.md|docs/definition-of-done.example.md|$HOME/.shipwright/templates/definition-of-done.example.md|false|false" + "pipeline-minimal.json|templates/pipelines/minimal.json|$HOME/.shipwright/pipelines/minimal.json|false|false" "pipeline-standard.json|templates/pipelines/standard.json|$HOME/.shipwright/pipelines/standard.json|false|false" "pipeline-fast.json|templates/pipelines/fast.json|$HOME/.shipwright/pipelines/fast.json|false|false" "pipeline-full.json|templates/pipelines/full.json|$HOME/.shipwright/pipelines/full.json|false|false" diff --git a/templates/pipelines/minimal.json b/templates/pipelines/minimal.json new file mode 100644 index 000000000..98d438d6e --- /dev/null +++ b/templates/pipelines/minimal.json @@ -0,0 +1,74 @@ +{ + "name": "minimal", + "description": "Trivial single-file fixes: intake → build → test → PR (no quality gates, no intelligence, minimal iterations)", + "defaults": { + "test_cmd": "npm test", + "model": "haiku", + "agents": 1 + }, + "intelligence": { + "adversarial_enabled": false, + "architecture_enabled": false, + "simulation_enabled": false + }, + "stages": [ + { + "id": "intake", + "enabled": true, + "gate": "auto", + "config": {} + 
}, + { + "id": "plan", + "enabled": false, + "gate": "auto", + "config": {} + }, + { + "id": "build", + "enabled": true, + "gate": "auto", + "config": { + "max_iterations": 3, + "audit": false, + "quality_gates": false + } + }, + { + "id": "test", + "enabled": true, + "gate": "auto", + "config": { "coverage_min": 0 } + }, + { + "id": "review", + "enabled": false, + "gate": "auto", + "config": {} + }, + { + "id": "compound_quality", + "enabled": false, + "gate": "auto", + "config": {} + }, + { + "id": "pr", + "enabled": true, + "gate": "auto", + "config": { "wait_ci": false } + }, + { + "id": "deploy", + "enabled": false, + "gate": "auto", + "config": {} + }, + { + "id": "validate", + "enabled": false, + "gate": "auto", + "config": {} + } + ] +} From ae87b3c20f36d7004141f84c9b998118b83bc21c Mon Sep 17 00:00:00 2001 From: Seth Ford Date: Mon, 9 Mar 2026 06:54:02 -0400 Subject: [PATCH 2/8] fix: add compound_quality_blocking to minimal template for consistency Co-Authored-By: Claude Opus 4.6 --- templates/pipelines/minimal.json | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/templates/pipelines/minimal.json b/templates/pipelines/minimal.json index 98d438d6e..817a461f5 100644 --- a/templates/pipelines/minimal.json +++ b/templates/pipelines/minimal.json @@ -50,7 +50,9 @@ "id": "compound_quality", "enabled": false, "gate": "auto", - "config": {} + "config": { + "compound_quality_blocking": true + } }, { "id": "pr", From 768e27baa82ac89c4a1244954a5e85aecdeb9524 Mon Sep 17 00:00:00 2001 From: Seth Ford Date: Mon, 9 Mar 2026 06:54:21 -0400 Subject: [PATCH 3/8] =?UTF-8?q?loop:=20iteration=201=20=E2=80=94=20autonom?= =?UTF-8?q?ous=20progress?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .claude/loop-logs/iteration-1.log | 12 +-- .claude/loop-state.md | 137 ++++++++++-------------------- 2 files changed, 48 insertions(+), 101 deletions(-) diff --git a/.claude/loop-logs/iteration-1.log 
b/.claude/loop-logs/iteration-1.log index 625f2153b..c09d71d55 100644 --- a/.claude/loop-logs/iteration-1.log +++ b/.claude/loop-logs/iteration-1.log @@ -1,11 +1 @@ -Perfect! The goal has been fully achieved. The shipwright CLI now has a `hello` command that outputs "hello world". The implementation: - -1. ✅ Added the `hello` command to the help text -2. ✅ Implemented the command case statement in the main router -3. ✅ Tested the command works correctly -4. ✅ All tests pass -5. ✅ Committed with a descriptive message - -The command is now available as `shipwright hello` (and with aliases `sw hello` and `cct hello`). - -LOOP_COMPLETE +{"type":"result","subtype":"success","is_error":false,"duration_ms":529352,"duration_api_ms":415316,"num_turns":65,"result":"All tests pass. Here's what was implemented:\n\n## Summary\n\n**New `minimal` pipeline template** (`templates/pipelines/minimal.json`):\n- Stages: intake → build → test → PR (compound_quality, plan, review, deploy all disabled)\n- Model: `haiku` (cheapest/fastest)\n- Max build iterations: 3 (vs 10 for fast/hotfix)\n- All intelligence disabled (no adversarial, architecture, or simulation)\n- No quality gates\n\n**Integration points updated:**\n- **Project detection** (`scripts/lib/project-detect.sh`): Recommends `minimal` for projects with ≤5 source files, no CI, no deploy\n- **Daemon triage** (`scripts/lib/daemon-triage.sh`): Score-based selection uses `minimal` for score ≥90 (very trivial issues)\n- **Upgrade** (`scripts/sw-upgrade.sh`): Added to file distribution list\n- **CLAUDE.md** (both project and global): Updated pipeline template tables\n- **Tests**: Updated project-detect and daemon-triage tests with new 
assertions\n\nLOOP_COMPLETE","stop_reason":"end_turn","session_id":"2f82583e-c851-41f2-9d8a-bb99e0145eab","total_cost_usd":2.91068575,"usage":{"input_tokens":56,"cache_creation_input_tokens":93351,"cache_read_input_tokens":3929906,"output_tokens":13527,"server_tool_use":{"web_search_requests":0,"web_fetch_requests":0},"service_tier":"standard","cache_creation":{"ephemeral_1h_input_tokens":93351,"ephemeral_5m_input_tokens":0},"inference_geo":"","iterations":[],"speed":"standard"},"modelUsage":{"claude-opus-4-6":{"inputTokens":56,"outputTokens":13527,"cacheReadInputTokens":3929906,"cacheCreationInputTokens":93351,"webSearchRequests":0,"costUSD":2.8868517499999995,"contextWindow":200000,"maxOutputTokens":32000},"claude-haiku-4-5-20251001":{"inputTokens":20379,"outputTokens":691,"cacheReadInputTokens":0,"cacheCreationInputTokens":0,"webSearchRequests":0,"costUSD":0.023834000000000004,"contextWindow":200000,"maxOutputTokens":32000}},"permission_denials":[],"fast_mode_state":"off","uuid":"ff624623-f744-46a5-a04c-78711b713e43"} diff --git a/.claude/loop-state.md b/.claude/loop-state.md index 0362dc9e6..0750ebb95 100644 --- a/.claude/loop-state.md +++ b/.claude/loop-state.md @@ -1,104 +1,71 @@ --- -goal: "Add a shipwright ping command that prints pong to stdout and exits 0 - -## Plan Summary -Plan complete and saved to `docs/plans/2026-03-02-ping-command.md`. 
- ---- - -## Summary - -The plan adds the `shipwright ping` command in **4 files, 9 tasks**: - -| # | Task | File(s) | -|---|------|---------| -| 1-2 | Create + chmod `sw-ping.sh` | `scripts/sw-ping.sh` (new) | -| 3-4 | Create + chmod `sw-ping-test.sh` | `scripts/sw-ping-test.sh` (new) | -| 5 | Run test in isolation — verify 6 PASS | — | -| 6 | Register `ping)` case in router | `scripts/sw` | -| 7 | Add test to `npm test` chain | `package.json` | -| 8 | Smoke-test via router | — | -| 9 | Commit | — | - -**Key decisions:** -- **Standalone script** (not inline in router) — only approach consistent with all 100+ existing commands, independently testable -[... full plan in .claude/pipeline-artifacts/plan.md] - -## Key Design Decisions -# Design: Add a shipwright ping command that prints pong to stdout and exits 0 -## Context -## Component Diagram -## Decision -## Interface Contracts -# sw-ping.sh — Public interface -# Invocation (no args): happy path -# stdout: "pong\n" -# stderr: (empty) -# exit: 0 -[... full design in .claude/pipeline-artifacts/design.md] +goal: "Add "minimal" pipeline template for trivial single-file fixes Historical context (lessons from previous pipelines): { "results": [ { - "file": "architecture.json", - "relevance": 95, - "summary": "Describes Command Router pattern, bash 3.2 conventions (set -euo pipefail, VERSION at top), snake_case function naming, and test harness structure — exactly what's needed to implement the ping command correctly" + "file": "failures.json", + "relevance": 75, + "summary": "Documents test failures related to pipeline stage artifacts (plan.md, review.md) not being created. Directly relevant to understanding pipeline template structure and artifact expectations for the minimal template." 
}, { - "file": "failures.json (comprehensive with 8 entries)", - "relevance": 85, - "summary": "Shows critical historical failures including 'output missing: intake' (23 occurrences, highest weight 7.8e+47), shell-init errors, and test infrastructure issues — directly relevant to avoiding similar failures in build stage" - }, - { - "file": "metrics.json (build_duration_s: 2826)", + "file": "metrics.json", "relevance": 55, - "summary": "Previous build took 47 minutes — provides performance baseline and expectation setting for current build duration" + "summary": "Baseline metrics show build_duration_s: 147, test_duration_s: 1, iterations: 1. Relevant for designing a minimal template optimized for trivial single-file fixes with faster execution." }, { - "file": "failures.json (shell-init: error retrieving current directory)", - "relevance": 50, - "summary": "Test stage failure in getcwd — indicates potential sandbox/environment issues that could affect ping command testing" + "file": "patterns.json", + "relevance": 45, + "summary": "Project conventions capture: nodejs, vitest test runner, npm package manager, src/ source directory. Provides project context for what the minimal template should support." }, { - "file": "patterns.json (import_style: commonjs)", + "file": "patterns.json", "relevance": 30, - "summary": "Indicates JavaScript/Node.js project context; mostly empty but shows partial project type detection from previous runs" + "summary": "Basic project type detection (nodejs, 2026-02-21). Less detailed than other patterns entry; provides minimal additional context." + }, + { + "file": "global.json", + "relevance": 10, + "summary": "Currently empty cross-repo learnings. May capture insights from this work for future use, but contains no current relevant data." 
} ] } Discoveries from other pipelines: -✓ Injected 1 new discoveries -[design] Design completed for Add a shipwright ping command that prints pong to stdout and exits 0 — Resolution: - -## Failure Diagnosis (Iteration 2) -Classification: unknown -Strategy: retry_with_context -Repeat count: 0 - -## Failure Diagnosis (Iteration 3) -Classification: unknown -Strategy: retry_with_context -Repeat count: 1 - -## Failure Diagnosis (Iteration 4) -Classification: unknown -Strategy: retry_with_context -Repeat count: 0" -iteration: 4 -max_iterations: 20 -status: error +✓ Injected 23 new discoveries +[intake] Stage intake completed — Resolution: +[intake] Stage intake completed — Resolution: +[intake] Stage intake completed — Resolution: +[compound_quality] Stage compound_quality completed — Resolution: +[intake] Stage intake completed — Resolution: +[pr] Stage pr completed — Resolution: +[pipeline_success] Pipeline success for issue #0 (fast template, stage=validate) — Resolution: success +[intake] Stage intake completed — Resolution: +[pr] Stage pr completed — Resolution: +[intake] Stage intake completed — Resolution: +[compound_quality] Stage compound_quality completed — Resolution: +[pr] Stage pr completed — Resolution: +[intake] Stage intake completed — Resolution: +[compound_quality] Stage compound_quality completed — Resolution: +[pr] Stage pr completed — Resolution: +[intake] Stage intake completed — Resolution: +[design] Design completed for Build a production-grade todo application. TypeScript + React frontend with Vite, Express REST API backend, SQLite persistence with Drizzle ORM, JWT authentication (register/login), full CRUD for todos with filtering (all/active/completed), drag-and-drop reorder, due dates, priorities (low/medium/high), dark mode, responsive design. Include comprehensive test suite (unit + integration + e2e). Production-ready: error handling, input validation, rate limiting, CORS, environment config. 
— Resolution: +[intake] Stage intake completed — Resolution: +[intake] Stage intake completed — Resolution: " +iteration: 0 +max_iterations: 10 +status: running test_cmd: "npm test" -model: sonnet +model: opus agents: 1 -started_at: 2026-03-02T08:27:01Z -last_iteration_at: 2026-03-02T08:27:01Z -consecutive_failures: 1 -total_commits: 3 -audit_enabled: true -audit_agent_enabled: true -quality_gates_enabled: true +started_at: 2026-03-09T10:45:28Z +last_iteration_at: 2026-03-09T10:45:28Z +consecutive_failures: 0 +total_commits: 0 +audit_enabled: false +audit_agent_enabled: false +quality_gates_enabled: false dod_file: "" auto_extend: true extension_count: 0 @@ -106,14 +73,4 @@ max_extensions: 3 --- ## Log -### Iteration 1 (2026-03-02T08:06:08Z) -This is also a task notification for a background command that was already retrieved and reviewed via `TaskOutput` in th -No new information — the ping command implementation is complete and `LOOP_COMPLETE` was already declared. - -### Iteration 2 (2026-03-02T08:25:28Z) -The background task already completed and was retrieved in my previous turn — `npm test` exited with code 0. 
The ping co -LOOP_COMPLETE - -### Iteration 3 (2026-03-02T08:26:58Z) -(no output) From 4a40b7c4be282223188f5ac971a4e7f9aa714ac6 Mon Sep 17 00:00:00 2001 From: Seth Ford Date: Mon, 9 Mar 2026 07:11:31 -0400 Subject: [PATCH 4/8] =?UTF-8?q?loop:=20iteration=201=20=E2=80=94=20post-au?= =?UTF-8?q?dit=20cleanup?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .claude/intelligence-cache.json | 2 +- .claude/loop-logs/tests-iter-1.log | 4253 +++++++++++++++++++++++++--- .claude/platform-hygiene.json | 24 +- 3 files changed, 3809 insertions(+), 470 deletions(-) diff --git a/.claude/intelligence-cache.json b/.claude/intelligence-cache.json index 9a521f319..c25fe39a6 100644 --- a/.claude/intelligence-cache.json +++ b/.claude/intelligence-cache.json @@ -263,7 +263,7 @@ } } ], - "timestamp": 1772984911, + "timestamp": 1773054569, "ttl": 1800 }, "fdbb7bc340414cf938cecd4b790fa9e2": { diff --git a/.claude/loop-logs/tests-iter-1.log b/.claude/loop-logs/tests-iter-1.log index 4e762dfc3..12d73bc14 100644 --- a/.claude/loop-logs/tests-iter-1.log +++ b/.claude/loop-logs/tests-iter-1.log @@ -1,557 +1,695 @@ -> shipwright-cli@1.7.1 test -> bash scripts/sw-pipeline-test.sh && bash scripts/sw-daemon-test.sh && bash scripts/sw-prep-test.sh && bash scripts/sw-fleet-test.sh && bash scripts/sw-fix-test.sh && bash scripts/sw-memory-test.sh && bash scripts/sw-session-test.sh && bash scripts/sw-init-test.sh && bash scripts/sw-tracker-test.sh && bash scripts/sw-heartbeat-test.sh && bash scripts/sw-remote-test.sh && bash scripts/sw-intelligence-test.sh && bash scripts/sw-pipeline-composer-test.sh && bash scripts/sw-self-optimize-test.sh && bash scripts/sw-predictive-test.sh && bash scripts/sw-frontier-test.sh && bash scripts/sw-connect-test.sh +> shipwright-cli@3.2.4 test +> bash scripts/sw-agi-roadmap-test.sh && bash scripts/sw-activity-test.sh && bash scripts/sw-adaptive-test.sh && bash scripts/sw-adversarial-test.sh && bash 
scripts/sw-architecture-enforcer-test.sh && bash scripts/sw-auth-test.sh && bash scripts/sw-autonomous-test.sh && bash scripts/sw-changelog-test.sh && bash scripts/sw-checkpoint-test.sh && bash scripts/sw-ci-test.sh && bash scripts/sw-cleanup-test.sh && bash scripts/sw-code-review-test.sh && bash scripts/sw-connect-test.sh && bash scripts/sw-context-budget-test.sh && bash scripts/sw-context-test.sh && bash scripts/sw-cost-test.sh && bash scripts/sw-daemon-test.sh && bash scripts/sw-dashboard-test.sh && bash scripts/sw-db-test.sh && bash scripts/sw-decompose-test.sh && bash scripts/sw-decide-test.sh && bash scripts/sw-deps-test.sh && bash scripts/sw-developer-simulation-test.sh && bash scripts/sw-discovery-test.sh && bash scripts/sw-doc-fleet-test.sh && bash scripts/sw-docs-agent-test.sh && bash scripts/sw-docs-test.sh && bash scripts/sw-doctor-test.sh && bash scripts/sw-dora-test.sh && bash scripts/sw-durable-test.sh && bash scripts/sw-e2e-orchestrator-test.sh && bash scripts/sw-eventbus-test.sh && bash scripts/sw-feedback-test.sh && bash scripts/sw-fix-test.sh && bash scripts/sw-fleet-discover-test.sh && bash scripts/sw-fleet-test.sh && bash scripts/sw-fleet-viz-test.sh && bash scripts/sw-frontier-test.sh && bash scripts/sw-github-app-test.sh && bash scripts/sw-github-checks-test.sh && bash scripts/sw-github-deploy-test.sh && bash scripts/sw-github-graphql-test.sh && bash scripts/sw-guild-test.sh && bash scripts/sw-heartbeat-test.sh && bash scripts/sw-hello-test.sh && bash scripts/sw-hygiene-test.sh && bash scripts/sw-incident-test.sh && bash scripts/sw-init-test.sh && bash scripts/sw-instrument-test.sh && bash scripts/sw-intelligence-test.sh && bash scripts/sw-jira-test.sh && bash scripts/sw-launchd-test.sh && bash scripts/sw-linear-test.sh && bash scripts/sw-logs-test.sh && bash scripts/sw-loop-test.sh && bash scripts/sw-memory-test.sh && bash scripts/sw-mission-control-test.sh && bash scripts/sw-model-router-test.sh && bash scripts/sw-otel-test.sh && bash 
scripts/sw-oversight-test.sh && bash scripts/sw-patrol-meta-test.sh && bash scripts/sw-pipeline-composer-test.sh && bash scripts/sw-pipeline-test.sh && bash scripts/sw-pipeline-vitals-test.sh && bash scripts/sw-pm-test.sh && bash scripts/sw-pr-lifecycle-test.sh && bash scripts/sw-predictive-test.sh && bash scripts/sw-prep-test.sh && bash scripts/sw-ps-test.sh && bash scripts/sw-public-dashboard-test.sh && bash scripts/sw-quality-test.sh && bash scripts/sw-reaper-test.sh && bash scripts/sw-recruit-test.sh && bash scripts/sw-regression-test.sh && bash scripts/sw-release-manager-test.sh && bash scripts/sw-release-test.sh && bash scripts/sw-root-cause-test.sh && bash scripts/sw-remote-test.sh && bash scripts/sw-replay-test.sh && bash scripts/sw-retro-test.sh && bash scripts/sw-scale-test.sh && bash scripts/sw-stall-detector-test.sh && bash scripts/sw-security-audit-test.sh && bash scripts/sw-self-optimize-test.sh && bash scripts/sw-session-test.sh && bash scripts/sw-setup-test.sh && bash scripts/sw-standup-test.sh && bash scripts/sw-status-test.sh && bash scripts/sw-strategic-test.sh && bash scripts/sw-stream-test.sh && bash scripts/sw-swarm-test.sh && bash scripts/sw-team-stages-test.sh && bash scripts/sw-templates-test.sh && bash scripts/sw-testgen-test.sh && bash scripts/sw-tmux-pipeline-test.sh && bash scripts/sw-tmux-test.sh && bash scripts/sw-trace-test.sh && bash scripts/sw-tracker-test.sh && bash scripts/sw-triage-test.sh && bash scripts/sw-upgrade-test.sh && bash scripts/sw-ux-test.sh && bash scripts/sw-webhook-test.sh && bash scripts/sw-widgets-test.sh && bash scripts/sw-worktree-test.sh && bash scripts/sw-lib-compat-test.sh && bash scripts/sw-lib-helpers-test.sh && bash scripts/sw-lib-error-actionability-test.sh && bash scripts/sw-lib-daemon-dispatch-test.sh && bash scripts/sw-lib-daemon-failure-test.sh && bash scripts/sw-lib-daemon-poll-test.sh && bash scripts/sw-lib-daemon-state-test.sh && bash scripts/sw-lib-daemon-triage-test.sh && bash 
scripts/sw-lib-daemon-patrol-test.sh && bash scripts/sw-lib-pipeline-detection-test.sh && bash scripts/sw-lib-pipeline-intelligence-test.sh && bash scripts/sw-lib-pipeline-quality-checks-test.sh && bash scripts/sw-lib-pipeline-stages-test.sh && bash scripts/sw-lib-pipeline-state-test.sh && bash scripts/sw-adapters-test.sh && bash scripts/sw-evidence-test.sh && bash scripts/sw-review-rerun-test.sh && bash scripts/sw-tracker-providers-test.sh && bash scripts/sw-budget-chaos-test.sh && bash scripts/sw-chaos-test.sh && bash scripts/sw-autonomous-e2e-test.sh && bash scripts/sw-memory-discovery-e2e-test.sh && bash scripts/sw-policy-e2e-test.sh && bash scripts/sw-e2e-smoke-test.sh && bash scripts/sw-dashboard-e2e-test.sh + + +╔════════════════════════════════════════════════════════════════╗ +║ AGI-Roadmap Validation — Real Tests for Every Feature ║ +╚════════════════════════════════════════════════════════════════╝ + +Phase 1: Feedback Loops (Discovery, Memory, PM, Failure Learning) + ▸ Daemon: failure_history initialized in state JSON... ✓ + ▸ Daemon: consecutive count uses reduce (run-from-newest, not total)... ✓ + ▸ Daemon: get_max_retries_for_class returns per-class limits... ✓ + ▸ Daemon: exponential backoff formula 5*2^(n-3) is correct... ✓ + ▸ Daemon: resume_after parsed in UTC (not local TZ)... ✓ + ▸ PM: recommend --json flag implemented... ✓ + ▸ PM: learn subcommand functional... ✓ + ▸ Daemon: PM recommend/learn wired into triage + success/failure... ✓ + ▸ Daemon: confidence < 60% upgrades to full template... ✓ + +Phase 2: Agent Coordination (Feedback, Predictive, Oversight, Autonomous) + ▸ Feedback: ARTIFACTS_DIR respects caller override... ✓ + ▸ Feedback: rollback uses PIPESTATUS for correct exit code... ✓ + ▸ Predictive: anomaly detection returns severity for 5x baseline... ✓ + ▸ Predictive: inject-prevention command exists and runs... ✓ + ▸ Pipeline: predictive anomaly/baseline/inject-prevention wired... 
✓ + ▸ Pipeline: memory metric wired into stage completion... ✓ + ▸ Oversight: gate approves clean review... ✓ + ▸ Oversight: gate rejects with --reject-if... ✓ + ▸ Oversight: gate JSON safe from newline/quote injection... ✓ + ▸ Pipeline: oversight gate wired + respects SKIP_GATES... ✓ + ▸ Pipeline: feedback collect/create-issue/rollback wired into monitor... ✓ + ▸ Pipeline: intelligence prediction validation wired... ✓ + ▸ Pipeline: predictive anomaly confirmation wired... ✓ + ▸ Pipeline: memory fix-outcome negative path wired... ✓ + ▸ Triage: offline fallback with recruit... ✓ + ▸ Recruit: policy integration wired... ✓ + ▸ Recruit: meta feedback loop... ✓ + ▸ Recruit: audit (negative-compounding feedback)... ✓ + ▸ Autonomous: safe recruit_args quoting... ✓ + ▸ Autonomous: Claude output redirected to findings file... ✓ + ▸ Autonomous: dual branch check (pipeline + daemon)... ✓ + ▸ Autonomous: run_scheduler with loop/sleep... ✓ + ▸ Autonomous: trigger_pipeline_for_finding wired... ✓ + ▸ Incident: create_hotfix_issue echoes issue number... ✓ + ▸ Incident: trigger_pipeline wires --template hotfix... ✓ + ▸ Incident: trigger_rollback wires sw-feedback.sh... ✓ + +Phase 3: Quality Assurance (Code Review, Testgen, Swarm, Multi-Agent) + ▸ Code review: run_claude_semantic_review exists... ✓ + ▸ Code review: semantic findings integrated... ✓ + ▸ Testgen: Claude prompt asks for real assertions... ✓ + ▸ Testgen: prompt piped to avoid shell expansion... ✓ + ▸ Swarm: spawn creates tmux session... ✓ + ▸ Swarm: retire kills tmux session... ✓ + ▸ Swarm: spawn/retire functional (real tmux)... ✓ + ▸ Loop: multi-agent restarts not blocked... ✓ + +Phase 4: Meta-Cognition (Effectiveness, Self-Awareness, Capability) + ▸ Pipeline: record_stage_effectiveness creates valid JSONL... ✓ + ▸ Pipeline: self-awareness hint triggers on >50% failure rate... ✓ + ▸ Pipeline: effectiveness recorded on both complete and failed... ✓ + ▸ Pipeline: discovery inject wired... 
✓ + ▸ Pipeline: self-awareness hint injected into plan prompt... ✓ + +Integration & Safety + ▸ CI: integration-claude jobs in workflow... ✓ + ▸ Integration-claude: skip path functional... ✓ + ▸ All modified scripts have strict mode... ✓ + ▸ All modified scripts have ERR trap... ✓ + ▸ No hardcoded secrets in scripts... ✓ + + ────────────────────────────────────────── + + All 53 tests passed! + + + Shipwright Activity Tests + ══════════════════════════════════════════ + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ Source guard pattern (if/then/fi) + ✓ VERSION variable defined + + Help Output + ✓ help exits 0 and contains USAGE + ✓ help lists 'watch' subcommand + ✓ help lists 'snapshot' subcommand + ✓ help lists 'history' subcommand + ✓ help lists 'stats' subcommand + ✓ help lists 'agents' subcommand + ✓ --help alias works + + Error Handling +shipwright activity — Live agent activity stream + +USAGE + shipwright activity [subcommand] [options] + +SUBCOMMANDS + watch Live stream of agent activity (default) + snapshot Current state of all active agents + history [range] Replay past activity (e.g., '1h', '10m', 'all') + stats Running counters (events, commits, tests, agents) + agents List known agents and last activity + help Show this help message + +OPTIONS + --type Filter events by type (e.g., 'stage.completed') + --agent Filter by agent name + --team Filter by team + --stage Filter by pipeline stage (e.g., 'build') + +EXAMPLES + \033[2mshipwright activity\033[0m # Live stream + \033[2mshipwright activity watch --type stage.*\033[0m # Only stage events + \033[2mshipwright activity history 1h\033[0m # Last hour + \033[2mshipwright activity snapshot\033[0m # Current state + \033[2mshipwright activity stats\033[0m # Counters + ✓ Unknown command exits non-zero + + Subcommands Without Events File + ✓ snapshot exits non-zero with no events + ✓ stats exits non-zero with no events + ✓ agents exits non-zero with no events + + Event Emission + ✓ 
emit_event creates events.jsonl + ✓ emit_event writes valid JSON + ✓ Event contains type field + ✓ Event contains agent field + ✓ Event contains numeric count + + Format Helpers + ✓ get_icon_for_type commit returns icon + ✓ get_icon_for_type test.passed returns icon + ✓ get_icon_for_type unknown returns bullet + ✓ format_timestamp strips T and Z + + Stats With Events + ✓ stats shows Total Events + ✓ stats shows Commits count + ✓ stats shows Pipelines count + History Subcommand + ✓ history all shows activity header -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright pipeline test — E2E Validation (Real Subprocess) ║ -╚═══════════════════════════════════════════════════════════════════╝ -Setting up mock environment... -✓ Environment ready: /tmp/sw-pipeline-test.vZ9o8O + ────────────────────────────────────────── - ▸ Preflight passes with all mocks... ✓ - ▸ Preflight fails when sw-loop.sh missing... ✓ - ▸ Start requires --goal or --issue... ✓ - ▸ Intake with --goal creates branch + artifacts... ✓ - ▸ Intake with --issue fetches from GitHub... ✓ - ▸ Plan generates plan.md, dod.md, tasks... ✓ - ▸ Build invokes sw loop and commits... ✓ - ▸ Test stage captures results to log... ✓ - ▸ Review generates report with severities... ✓ - ▸ PR stage creates PR URL artifact... ✓ - ▸ Full E2E pipeline (6 stages)... ✓ - ▸ Resume continues from partial state... ✓ - ▸ Abort marks pipeline as aborted... ✓ - ▸ Dry run shows config, no artifacts... ✓ - ▸ Self-healing build→test retry loop... ✓ + All 28 tests passed -━━━ Results ━━━ - Passed: 15 - Failed: 0 - Total: 15 -All 15 tests passed! 
+ Shipwright Adaptive Tests + ══════════════════════════════════════════ + help / version + ✓ help exits 0 + ✓ help shows USAGE + ✓ help shows SUBCOMMANDS + ✓ help mentions get + ✓ help mentions train + ✓ help mentions profile + ✓ version exits 0 + ✓ version output contains version string -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright daemon test — Unit Tests (Synthetic Events) ║ -╚═══════════════════════════════════════════════════════════════════╝ + error handling + ✓ Unknown command exits non-zero -Setting up test environment... -✓ Environment ready: /tmp/sw-daemon-test.Ootf7j + get command + ✓ get timeout with default exits 0 - ▸ dora_grade deploy_freq Elite (>= 7)... ✓ - ▸ dora_grade deploy_freq High (>= 1)... ✓ - ▸ dora_grade deploy_freq Medium (>= 0.25)... ✓ - ▸ dora_grade deploy_freq Low (< 0.25)... ✓ - ▸ dora_grade CFR thresholds (Elite/High/Medium/Low)... ✓ - ▸ Stage timings filter-first jq query... ✓ - ▸ MTTR pairs failures with next success... ✓ - ▸ epoch_to_iso helper function... ✓ - ▸ Health check detects stale jobs... ✓ - ▸ Priority label sorting... ✓ - ▸ Degradation alert triggers on high CFR... ✓ - ▸ Metrics --json output with cycle_time keys... ✓ - ▸ Self-labeling includes watch_label when enabled... ✓ - ▸ Self-labeling excludes watch_label when disabled... ✓ - ▸ Patrol recurring failures label construction... ✓ - ▸ DORA degradation event detection... ✓ - ▸ Retry exhaustion event detection... ✓ - ▸ Untested script detection logic... ✓ + profile command + ✓ profile exits 0 -━━━ Results ━━━ - Passed: 18 - Failed: 0 - Total: 18 + reset command + ✓ reset exits 0 -All 18 tests passed! 
+ script safety + ✓ Uses set -euo pipefail + ✓ Has source guard pattern + statistical functions + ✓ percentile, mean, median functions defined in source + ✓ mean returns numeric value (avg of 1-5 is 3) -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright prep test — Validation Suite (Real Subprocess) ║ -╚═══════════════════════════════════════════════════════════════════╝ + get_timeout / get_iterations / get_model + ✓ get_timeout returns number (default with no events) + ✓ get_iterations returns number + ✓ get_model returns valid model name -Setting up test environment... -✓ Environment ready: /tmp/sw-prep-test.1SpJfU - - ▸ Node.js project detection... ✓ - ▸ Python project detection... ✓ - ▸ Go project detection... ✓ - ▸ Rust project detection... ✓ - ▸ settings.json is valid JSON... ✓ - ▸ Hook scripts are executable... ✓ - ▸ Hook scripts have valid syntax... ✓ - ▸ CLAUDE.md has required sections... ✓ - ▸ Check mode outputs scoring... ✓ - ▸ Idempotency without --force... ✓ - ▸ --force overwrites modified files... ✓ - ▸ No eval in generated hooks... ✓ - ▸ Definition of Done generated... ✓ + train subcommand + ✓ train subcommand runs with mock events -━━━ Results ━━━ - Passed: 13 - Failed: 0 - Total: 13 + ────────────────────────────────────────── -All 13 tests passed! + All 20 tests passed -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright fleet test — Unit Tests ║ -╚═══════════════════════════════════════════════════════════════════╝ -Setting up test environment... -✓ Environment ready: /tmp/sw-fleet-test.lIcp2s + shipwright adversarial test + ══════════════════════════════════════════ - ▸ Help output contains expected sections... ✓ - ▸ Help via --help flag... ✓ - ▸ Config parsing — valid config... ✓ - ▸ Config parsing — missing config file... ✓ - ▸ Config parsing — invalid JSON... ✓ - ▸ Config parsing — empty repos array... ✓ - ▸ Config defaults applied... 
✓ - ▸ Fleet init generates config template... ✓ - ▸ Fleet init skips when config exists... ✓ - ▸ Fleet start spawns tmux sessions per repo... ✓ - ▸ Fleet start skips missing repos... ✓ - ▸ Fleet start skips existing sessions... ✓ - ▸ Fleet start creates fleet state file... ✓ - ▸ Fleet start emits fleet.started event... ✓ - ▸ Fleet start applies repo-level overrides... ✓ - ▸ Fleet stop kills sessions and cleans state... ✓ - ▸ Fleet stop — no fleet running... ✓ - ▸ Fleet stop emits fleet.stopped event... ✓ - ▸ Fleet status — no fleet running... ✓ - ▸ Fleet status shows dashboard... ✓ - ▸ Fleet metrics — no events file... ✓ - ▸ Fleet metrics dashboard output... ✓ - ▸ Fleet metrics JSON output... ✓ - ▸ Fleet metrics period flag... ✓ - ▸ Session name generation... ✓ - ▸ Fleet start skips non-git repos... ✓ - ▸ Unknown subcommand... ✓ + ══════════════════════════════════════════ -━━━ Results ━━━ - Passed: 27 - Failed: 0 - Total: 27 + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ Source guard pattern (if/then/fi) + ✓ VERSION variable defined -All 27 tests passed! + Help Output + ✓ help exits 0 and contains USAGE + ✓ help lists 'review' subcommand + ✓ help lists 'iterate' subcommand + ✓ help mentions adversarial_enabled flag + ✓ --help alias works + ✓ -h alias works + Error Handling + ✓ Unknown command exits non-zero -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright fix test — Unit Tests ║ -╚═══════════════════════════════════════════════════════════════════╝ + Review Subcommand + ✓ review disabled returns empty JSON array + ✓ review without diff arg exits non-zero -Setting up test environment... -✓ Environment ready: /tmp/sw-fix-test.yrL4tG + Iterate Subcommand + ✓ iterate without args exits non-zero + ✓ iterate with empty findings converges + ✓ iterate past max rounds returns findings - ▸ Help output contains expected sections... ✓ - ▸ Help via --help flag... ✓ - ▸ Missing goal shows error... 
✓ - ▸ Missing repos shows error... ✓ - ▸ Arg parsing — --repos comma-separated... ✓ - ▸ Arg parsing — --repos-from file... ✓ - ▸ Arg parsing — --repos-from missing file... ✓ - ▸ Arg parsing — --pipeline template... ✓ - ▸ Arg parsing — --max-parallel... ✓ - ▸ Arg parsing — --branch-prefix... ✓ - ▸ Dry run shows what would happen... ✓ - ▸ Dry run does not create state file... ✓ - ▸ Fix status — no sessions... ✓ - ▸ Fix status shows existing sessions... ✓ - ▸ Invalid repo directory shows error... ✓ - ▸ Fix start creates state file... ✓ - ▸ Fix start emits events... ✓ - ▸ Fix start — summary output... ✓ - ▸ Branch name sanitization... ✓ - ▸ Fix header shows configuration... ✓ - ▸ Non-git repo warning... ✓ - ▸ Per-repo event tracking... ✓ + Configuration + ✓ ADVERSARIAL_MAX_ROUNDS env var respected + ✓ disabled config returns empty array -━━━ Results ━━━ - Passed: 22 - Failed: 0 - Total: 22 + Event Emission + ✓ emit_event creates events.jsonl + ✓ emit_event writes valid JSON + ✓ Event contains type field -All 22 tests passed! -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright memory+cost test — Unit Tests for Memory & Cost ║ -╚═══════════════════════════════════════════════════════════════════╝ -Setting up mock environment... -✓ Environment ready: /tmp/sw-memory-test.IW0uqN + ────────────────────────────────────────── - ▸ Memory capture from pipeline state... ✓ - ▸ Memory inject returns context for each stage... ✓ - ▸ Failure capture stores patterns... ✓ - ▸ Pattern detection identifies project type... ✓ - ▸ Cross-repo vs per-repo isolation... ✓ - ▸ Memory show displays dashboard... ✓ - ▸ Memory search finds matching entries... ✓ - ▸ Memory export produces valid JSON... ✓ - ▸ Memory forget clears repo memory... ✓ - ▸ Cost calculation for each model... ✓ - ▸ Cost recording writes to costs.json... ✓ - ▸ Budget set and check... ✓ - ▸ Cost dashboard runs without errors... ✓ - ▸ Cost JSON output is valid... 
✓ - ▸ Actionable failures threshold filtering... ✓ - ▸ Actionable failures with no file returns []... ✓ - ▸ DORA baseline calculation from events... ✓ + All 0 tests passed -━━━ Results ━━━ - Passed: 17 - Failed: 0 - Total: 17 -All 17 tests passed! + shipwright architecture-enforcer test + ══════════════════════════════════════════ -╔═══════════════════════════════════════════════════╗ -║ shipwright session — E2E Test Suite ║ -╚═══════════════════════════════════════════════════╝ + ══════════════════════════════════════════ -Setting up test environment... -Test tmux session: sw-test-1268715 - -Template Loading - ▸ Load feature-dev template (3 agents)... ✓ - ▸ Load exploration template (2 agents)... ✓ - ▸ Missing template returns error... ✓ - ▸ All 24 templates load successfully... ✓ - -Window Creation - ▸ No-launch creates window without claude... ✓ - ▸ Duplicate window detection... ✓ - ▸ Window gets dark theme... ✓ - ▸ Auto-generated team name... ✓ - -Prompt & Launcher - ▸ Launcher script generation with template... ✓ - ▸ Output includes agents from template... ✓ - ▸ Output includes goal when provided... ✓ - ▸ No template + goal works... ✓ - ▸ No template + no goal = interactive... ✓ - -CLI & Configuration - ▸ Help flag... ✓ - ▸ Unknown option... ✓ - ▸ TMPDIR used for launcher files... ✓ + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ Source guard pattern (if/then/fi) + ✓ VERSION variable defined -Enhanced Features - ▸ Template auto-suggestion from goal... ✓ - ▸ No suggestion without goal... ✓ - ▸ Memory injection in prompt... ✓ - ▸ CLAUDE.md reminder in prompt... ✓ - ▸ Secure temp dir in source... 
✓ + Help Output + ✓ help exits 0 and contains USAGE + ✓ help lists 'build' subcommand + ✓ help lists 'validate' subcommand + ✓ help lists 'evolve' subcommand + ✓ help mentions architecture_enabled flag + ✓ --help alias works + ✓ -h alias works -════════════════════════════════════════════════════ - All 21 tests passed ✓ -════════════════════════════════════════════════════ + Error Handling + ✓ Unknown command exits non-zero + Build Subcommand + ✓ build disabled returns empty JSON object + ✓ build with enabled returns model with layers + ✓ build model contains patterns array + ✓ build model contains conventions array -╔═══════════════════════════════════════════════════╗ -║ shipwright init — E2E Test Suite ║ -╚═══════════════════════════════════════════════════╝ + Validate Subcommand + ✓ validate disabled returns empty array + ✓ validate without diff arg exits non-zero + ✓ validate without model file returns empty array -Setting up sandboxed environment... -Temp dir: /tmp/sw-init-test.WldBc1 + Evolve Subcommand + ✓ evolve disabled exits 0 + ✓ evolve without model exits 0 -Configuration - ▸ Settings.json created with agent teams... ✓ - ▸ Settings merge preserves existing vars... ✓ - ▸ tmux.conf installed... ✓ - ▸ Overlay installed... ✓ + Model Storage + ✓ build stores model in ~/.shipwright/memory/ + ✓ Stored model is valid JSON -Templates - ▸ Team templates installed (>= 10)... ✓ - ▸ Pipeline templates installed (>= 5)... ✓ - ▸ Legacy templates path populated... ✓ + Event Emission + ✓ emit_event creates events.jsonl + ✓ emit_event writes valid JSON + ✓ Event contains type field -Robustness - ▸ Idempotency — double init safe... ✓ - ▸ Doctor runs at end... ✓ - ▸ Help flag... ✓ -Hook Wiring - ▸ JSONC stripped from settings.json... ✓ - ▸ Hooks wired into settings.json... ✓ - ▸ Hook wiring preserves existing hooks... ✓ - ▸ SessionStart hook installed... ✓ - ▸ Hook wiring with pre-existing settings... 
✓ -════════════════════════════════════════════════════ - All 15 tests passed ✓ -════════════════════════════════════════════════════ + ────────────────────────────────────────── -╔═══════════════════════════════════════════════════╗ -║ shipwright tracker — Test Suite ║ -╚═══════════════════════════════════════════════════╝ + All 0 tests passed -Setting up test environment... -Provider Loading - ▸ Provider loads from config (linear)... ✓ - ▸ Provider loads jira... ✓ - ▸ Graceful skip when no provider... ✓ + Shipwright Auth Tests + ══════════════════════════════════════════ -Pipeline Enrichment - ▸ Stage descriptions exist for all 12 stages... ✓ - ▸ Enriched progress body has Delivering line... ✓ - ▸ Enriched progress body has stage descriptions... ✓ - ▸ Pipeline state includes stage_progress... ✓ - ▸ Pipeline state includes stage description... ✓ + Help + ✓ help shows usage + ✓ help shows commands + ✓ --help flag works + ✓ unknown command shows error + Status & Users + ✓ status with no login shows warning + ✓ users with empty shows warning + ✓ token with no user errors + ✓ switch without user exits non-zero + Auth Storage + ✓ auth file has users array + ✓ auth file has active_user + ✓ users lists stored user + ✓ token shows stored token + ✓ user info shows login + ✓ switch to nonexistent errors + ✓ logout succeeds -Integration - ▸ Tracker notify routes to provider (mock)... ✓ - ▸ Dashboard reads goal from pipeline state... ✓ - ▸ Jira config validation... ✓ - ▸ Linear config migration (legacy fallback)... ✓ + ────────────────────────────────────────── -════════════════════════════════════════════════════ - All 12 tests passed ✓ -════════════════════════════════════════════════════ + All 15 tests passed -╔═══════════════════════════════════════════════════╗ -║ shipwright heartbeat + checkpoint — Test Suite ║ -╚═══════════════════════════════════════════════════╝ -Setting up test environment... 
+ Shipwright Autonomous Tests + ══════════════════════════════════════════ -Heartbeat Lifecycle - ▸ Write heartbeat creates JSON file... ✓ - ▸ Check heartbeat reports alive... ✓ Job test-job-alive alive (0s ago) -✓ - ▸ Check heartbeat reports stale... ⚠ Job test-job-stale stale (192940576s ago, timeout: 120s) -✓ - ▸ Clear heartbeat removes file... ✓ Cleared heartbeat for job: test-job-clear -✓ - ▸ List heartbeats returns JSON array... ✓ - ▸ Heartbeat update overwrites existing... ✓ - ▸ Check missing heartbeat returns error... ✓ - ▸ Heartbeat dir auto-created when missing... ✓ + ══════════════════════════════════════════ -Checkpoint Lifecycle - ▸ Checkpoint save creates JSON file... ✓ Checkpoint saved for stage build (iteration 5) -✓ - ▸ Checkpoint restore outputs JSON... ✓ Checkpoint saved for stage test (iteration 3) -✓ - ▸ Checkpoint restore missing stage fails... ✓ - ▸ Checkpoint clear removes file... ✓ Checkpoint saved for stage review (iteration 1) -✓ Cleared checkpoint for stage review -✓ - ▸ Checkpoint clear --all removes all... ✓ Checkpoint saved for stage build (iteration 1) -✓ Checkpoint saved for stage test (iteration 2) -✓ Cleared 2 checkpoint(s) -✓ - ▸ Checkpoint save with files-modified... ✓ Checkpoint saved for stage build (iteration 7) -✓ + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ unknown command exits 1 + ✓ start shows running message + ✓ state.json created after start + ✓ state status is running + ✓ config.json created + ✓ config is valid JSON + ✓ status shows dashboard + ✓ pause sets status to paused + ✓ resume sets status to running + ✓ stop sets status to stopped + ✓ config show displays settings + ✓ config set interval works + ✓ history handles no data -Integration - ▸ Pipeline script has heartbeat functions... ✓ - ▸ Loop script has heartbeat and checkpoint... ✓ - ▸ Pipeline has human intervention checks... 
✓ + Config Set/Show Cycle + ✓ config show reflects set value + ✓ config value persists in file -════════════════════════════════════════════════════ - All 17 tests passed ✓ -════════════════════════════════════════════════════ + Config Structure + ✓ config contains key: cycle_interval_minutes + ✓ config contains key: max_issues_per_cycle + ✓ config contains key: daemon_aware + History With Fixture Events + ✓ history shows recent cycles + ✓ history shows cycle entries -╔═══════════════════════════════════════════════════╗ -║ shipwright remote — Test Suite ║ -╚═══════════════════════════════════════════════════╝ + Status Fields + ✓ status when running includes Status + ✓ status when running includes Cycles + ✓ status when running includes Issues Created + ✓ status when running includes Pipelines + ✓ status when running includes Cycle Interval + ✓ status when running shows running + ✓ status when stopped shows stopped -Setting up test environment... -Machine Registry - ▸ Add machine creates machines.json... ▸ Checking shipwright installation at /tmp/sw-remote-test.PmYCBx/mock-install... -✓ Shipwright found at /tmp/sw-remote-test.PmYCBx/mock-install -✓ Registered machine: builder-1 (localhost, worker, 4 workers) -✓ - ▸ Add second machine appends to array... ▸ Checking shipwright installation at /tmp/sw-remote-test.PmYCBx/mock-install... -✓ Shipwright found at /tmp/sw-remote-test.PmYCBx/mock-install -✓ Registered machine: builder-2 (localhost, worker, 8 workers) -✓ - ▸ Remove machine by name... ✓ Removed machine: builder-2 -✓ - ▸ List machines returns output... ✓ - ▸ machines.json uses atomic writes... ✓ - ▸ Duplicate machine name prevented... ▸ Use shipwright remote remove builder-1 first -✓ - ▸ Remote script has help command... ✓ -CLI & Dashboard Integration - ▸ CLI router includes remote command... ✓ - ▸ CLI router includes heartbeat command... ✓ - ▸ CLI router includes checkpoint command... ✓ - ▸ Doctor has heartbeat health check... 
✓ - ▸ Doctor has remote machine checks... ✓ - ▸ Status shows heartbeat section... ✓ - ▸ Status shows remote machines section... ✓ + ────────────────────────────────────────── -════════════════════════════════════════════════════ - All 14 tests passed ✓ -════════════════════════════════════════════════════ + All 0 tests passed -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright intelligence test — Unit Tests ║ -╚═══════════════════════════════════════════════════════════════════╝ -Setting up test environment... -✓ Environment ready: /tmp/sw-intelligence-test.wTXBVg + Shipwright Changelog Tests + ══════════════════════════════════════════ - ▸ analyze_issue returns valid schema... ✓ - ▸ Cache hit on second call with same input... ✓ - ▸ Graceful degradation when claude CLI unavailable... ✓ - ▸ compose_pipeline produces valid pipeline JSON... ✓ - ▸ recommend_model returns valid model names... ✓ - ▸ predict_cost returns numeric estimates... ✓ - ▸ Cache TTL expiry returns miss... ✓ - ▸ search_memory returns ranked results... ✓ - ▸ Feature flag disabled returns fallback... ✓ - ▸ Events emitted for analysis... ✓ - ▸ recommend_model emits events... ✓ - ▸ Cache init creates file if missing... ✓ + ══════════════════════════════════════════ -━━━ Results ━━━ - Passed: 12 - Failed: 0 - Total: 12 + help / version + ✓ help exits 0 + ✓ help shows USAGE + ✓ help mentions generate + ✓ help mentions preview + ✓ help mentions version + ✓ help mentions migrate + ✓ VERSION variable defined -All 12 tests passed! + error handling + ✓ Unknown command exits non-zero + formats command + ✓ formats exits 0 -╔═══════════════════════════════════════════════════╗ -║ shipwright pipeline composer — Test Suite ║ -╚═══════════════════════════════════════════════════╝ + generate command + ✓ generate exits 0 -Setting up test environment... + version command + ✓ version recommendation exits 0 -Pipeline Composition - ▸ Composed pipeline has valid stage ordering... 
✓ - ▸ High-risk issue gets security stages... ▸ Composing pipeline with intelligence engine... -✓ Composed pipeline: 7 stages -.claude/pipeline-artifacts/composed-pipeline.json -✓ - ▸ Fallback to static template when no intelligence... ✓ + script safety + ✓ Uses set -euo pipefail + ✓ Has source guard pattern -Conditional Stage Insertion - ▸ Stage inserted at correct position after build... ✓ - ▸ Insert into nonexistent stage fails... ✓ -Model Downgrade - ▸ Budget constraint triggers model downgrades... ✓ - ▸ Downgrade with nonexistent stage fails... ✓ -Pipeline Validation - ▸ Validation accepts valid pipeline... ✓ - ▸ Validation rejects invalid ordering (test before build)... ✓ - ▸ Validation rejects missing stage ids... ✓ - ▸ Validation rejects missing stages array... ✓ + ────────────────────────────────────────── -Iteration Estimation - ▸ Iteration estimates are reasonable (1-50 range)... ✓ + All 0 tests passed -════════════════════════════════════════════════════ - All 12 tests passed ✓ -════════════════════════════════════════════════════ -━━━ shipwright self-optimize tests ━━━ + shipwright checkpoint test + ══════════════════════════════════════════ - ▸ Outcome analysis extracts correct metrics... ✓ - ▸ Outcome analysis emits event... ✓ - ▸ Outcome analysis rejects missing file... ✓ - ▸ Template weight increases for high success... ✓ - ▸ Template weight decreases for low success... ✓ - ▸ A/B test selects ~20% sample... ✓ - ▸ Iteration model updates with data points... ✓ - ▸ Model routing tracks success rates... ✓ - ▸ Model routing keeps opus with few sonnet samples... ✓ - ▸ Memory pruning removes old patterns... ✓ - ▸ Memory strengthening boosts confirmed patterns... ✓ - ▸ Memory promotion copies cross-repo patterns... ✓ - ▸ Full analysis runs on empty data... ✓ - ▸ Report generates output with data... ✓ - ▸ Report handles empty outcomes... ✓ - ▸ Outcome analysis extracts stage data... 
✓ + ══════════════════════════════════════════ -━━━ Results ━━━ - Passed: 16 - Failed: 0 - Total: 16 + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ VERSION variable defined + + Help Output + ✓ help exits 0 and contains USAGE + ✓ help lists 'save' subcommand + ✓ help lists 'restore' subcommand + ✓ help lists 'list' subcommand + ✓ help lists 'clear' subcommand + ✓ help lists 'expire' subcommand + ✓ --help alias works + ✓ -h alias works + + Error Handling + +shipwright checkpoint v3.2.4 — Save and restore agent state mid-stage + +USAGE + shipwright checkpoint [options] + +COMMANDS + save Save a checkpoint for a stage + restore Restore a checkpoint (prints JSON to stdout) + save-context Save Claude context (goal, findings, test output) for resume + restore-context Restore Claude context (exports RESTORED_* and SW_LOOP_* vars) + list Show all available checkpoints + clear Remove checkpoint(s) + expire Remove checkpoints older than N hours + +SAVE OPTIONS + --stage Stage name (required) + --iteration Current iteration number + --git-sha Git commit SHA (default: HEAD) + --files-modified "f1,f2" Comma-separated list of modified files + --tests-passing Mark tests as passing + --loop-state Loop state (running, paused, etc.) + +RESTORE OPTIONS + --stage Stage to restore (required) + +CLEAR OPTIONS + --stage Stage to clear + --all Clear all checkpoints + +EXPIRE OPTIONS + --hours Max age in hours (default: 24) -All 16 tests passed! 
+EXAMPLES + shipwright checkpoint save --stage build --iteration 5 + shipwright checkpoint save --stage build --iteration 3 --tests-passing --files-modified "src/auth.ts,src/middleware.ts" + shipwright checkpoint restore --stage build + shipwright checkpoint list + shipwright checkpoint clear --stage build + shipwright checkpoint clear --all + shipwright checkpoint expire --hours 48 + ✓ Unknown command exits non-zero + + Save Subcommand + +shipwright checkpoint v3.2.4 — Save and restore agent state mid-stage + +USAGE + shipwright checkpoint [options] + +COMMANDS + save Save a checkpoint for a stage + restore Restore a checkpoint (prints JSON to stdout) + save-context Save Claude context (goal, findings, test output) for resume + restore-context Restore Claude context (exports RESTORED_* and SW_LOOP_* vars) + list Show all available checkpoints + clear Remove checkpoint(s) + expire Remove checkpoints older than N hours + +SAVE OPTIONS + --stage Stage name (required) + --iteration Current iteration number + --git-sha Git commit SHA (default: HEAD) + --files-modified "f1,f2" Comma-separated list of modified files + --tests-passing Mark tests as passing + --loop-state Loop state (running, paused, etc.) 
+ +RESTORE OPTIONS + --stage Stage to restore (required) + +CLEAR OPTIONS + --stage Stage to clear + --all Clear all checkpoints + +EXPIRE OPTIONS + --hours Max age in hours (default: 24) + +EXAMPLES + shipwright checkpoint save --stage build --iteration 5 + shipwright checkpoint save --stage build --iteration 3 --tests-passing --files-modified "src/auth.ts,src/middleware.ts" + shipwright checkpoint restore --stage build + shipwright checkpoint list + shipwright checkpoint clear --stage build + shipwright checkpoint clear --all + shipwright checkpoint expire --hours 48 + ✓ save without --stage exits non-zero +✓ Checkpoint saved for stage build (iteration 5) + ✓ save creates checkpoint file + ✓ Checkpoint is valid JSON + ✓ Checkpoint stage field correct + ✓ Checkpoint iteration field correct + ✓ Checkpoint git_sha populated +✓ Checkpoint saved for stage test (iteration 0) + ✓ save --tests-passing sets true +✓ Checkpoint saved for stage review (iteration 0) + ✓ save --files-modified stores 2 files +✓ Checkpoint saved for stage deploy (iteration 0) + ✓ save --loop-state stores state + ✓ Checkpoint created_at timestamp present + + Restore Subcommand + ✓ restore returns checkpoint JSON + ✓ Restored checkpoint has correct stage + ✓ restore missing stage exits non-zero + ✓ restore without --stage exits non-zero + + List Subcommand + ✓ list shows Checkpoints header + ✓ list shows build checkpoint + ✓ list shows checkpoint count + ✓ list with no checkpoints shows empty + + Clear Subcommand +✓ Checkpoint saved for stage build (iteration 1) +✓ Checkpoint saved for stage test (iteration 2) +✓ Cleared checkpoint for stage build + ✓ clear --stage removes specific checkpoint + ✓ clear --stage preserves other checkpoints + ✓ clear without args exits non-zero +✓ Checkpoint saved for stage build (iteration 3) +✓ Cleared 2 checkpoint(s) + ✓ clear --all removes all checkpoints + Expire Subcommand + ✓ expire with no checkpoints exits 0 
-╔═══════════════════════════════════════════════════════════╗ -║ shipwright predictive test ║ -╚═══════════════════════════════════════════════════════════╝ + Save-context / Restore-context + ✓ save-context creates claude-context.json + ✓ Context goal saved correctly + ✓ Context iteration saved correctly + ✓ restore-context exports RESTORED_GOAL + ✓ restore-context exports SW_LOOP_GOAL -Risk Assessment - ▸ Risk returns valid schema with 0-100 range... ✓ - ▸ Risk elevates for complex keywords... ✓ - ▸ Risk emits event... ✓ -Anomaly Detection - ▸ Critical at 3x threshold (181 vs 60 baseline)... ✓ - ▸ Normal at 2x (119 vs 60 baseline)... ✓ - ▸ Warning between 2x and 3x (150 vs 60)... ✓ - ▸ Normal when no baseline exists... ✓ - ▸ Emits event for critical anomaly... ✓ -Baseline Management - ▸ First data point uses raw value... ✓ - ▸ EMA calculation (0.9*100 + 0.1*200 = 110)... ✓ - ▸ Creates baseline file if missing... ✓ + ────────────────────────────────────────── -Preventative Injection - ▸ Injects context from matching patterns... ✓ - ▸ Returns empty for non-matching stage... ✓ + All 0 tests passed -Graceful Degradation - ▸ Works without intelligence engine... ✓ -AI Patrol - ▸ AI patrol returns structured findings... 
✓ -════════════════════════════════════════════════════ - All 15 tests passed ✓ -════════════════════════════════════════════════════ + Shipwright CI Tests + ══════════════════════════════════════════ + ══════════════════════════════════════════ -╔═══════════════════════════════════════════════════╗ -║ shipwright frontier — Test Suite ║ -╚═══════════════════════════════════════════════════╝ + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ version shows version + ✓ unknown command exits 1 + ✓ generate starts processing + ✓ validate runs on valid workflow + ✓ VERSION is defined + ✓ analyze shows analysis + ✓ analyze shows cache info + ✓ matrix generates config + ✓ matrix workflow file exists + ✓ validate passes on valid workflow + ✓ runners list shows options + ✓ runners recommend shows guidance -Setting up test environment... -Adversarial Review - ▸ Adversarial review produces structured findings... ✓ - ▸ Adversarial iteration converges on no critical findings... ✓ -Developer Simulation - ▸ Simulation generates objections from 3 personas... ✓ - ▸ Simulation address returns action items... ✓ + ────────────────────────────────────────── -Architecture Enforcer - ▸ Architecture model has valid schema... ✓ - ▸ Architecture validates changes (violation detected)... ✓ + All 0 tests passed -Graceful Degradation - ▸ All three degrade gracefully when claude unavailable... 
✓ -════════════════════════════════════════════════════ - All 7 tests passed ✓ -════════════════════════════════════════════════════ + + Shipwright Cleanup Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help + ✓ help exits 0 + ✓ help shows usage + ✓ help shows --force + ✓ help shows dry-run + + Dry-Run (Empty) + ✓ dry-run shows Tmux Windows section + ✓ dry-run shows Team Configs section + ✓ dry-run shows Task Lists section + ✓ dry-run shows Pipeline Artifacts section + ✓ dry-run shows Pipeline State section + ✓ dry-run shows Heartbeats section + ✓ dry-run reports clean + + Force Mode (Empty) + ✓ force shows FORCE MODE + ✓ force reports nothing to clean + + Detect Team Configs + ✓ dry-run detects team dir + ✓ dry-run shows would remove + + Force Removes Teams + ✓ force removes team dir + ✓ team directory actually removed + + Detect Task Lists + ✓ dry-run detects task dir + ✓ task directory actually removed + + Detect Stale Heartbeats + ✓ dry-run detects stale heartbeat + + Error Handling + ✓ unknown option exits non-zero + ✓ unknown option shows error + + Summary Counting + ✓ summary shows found count + ✓ summary shows --force hint + + + + ────────────────────────────────────────── + + All 24 tests passed + + + + Shipwright Code Review Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ review runs and starts reviewing + ✓ trends with no data + ✓ config show outputs valid config + ✓ unknown subcommand exits nonzero + ✓ no false long function detection on small file + ✓ style consistency check runs without crash + ✓ architecture boundary check runs without crash + + + + ────────────────────────────────────────── + + All 0 tests passed ╔═══════════════════════════════════════════════════╗ @@ -560,24 +698,24 @@ All 16 tests passed! Setting up test environment... 
-Identity Resolution +Identity Resolution ▸ resolve_developer_id from DEVELOPER_ID env... ✓ ▸ resolve_developer_id from git config... ✓ ▸ resolve_developer_id fallback to USER... ✓ ▸ resolve_machine_name from MACHINE_NAME env... ✓ ▸ resolve_machine_name from hostname... ✓ -Dashboard URL Resolution +Dashboard URL Resolution ▸ resolve_dashboard_url from --url flag... ✓ ▸ resolve_dashboard_url from DASHBOARD_URL env... ✓ ▸ resolve_dashboard_url from team-config.json... ✓ ▸ resolve_dashboard_url falls back to default... ✓ -Start/Stop Lifecycle +Start/Stop Lifecycle ▸ cmd_start creates PID file... ▸ Starting connect to http://localhost:8767 ▸ Developer: test-developer @ test-machine -✓ Connect started (PID 1279947) - Logs: /tmp/sw-connect-test.OESvLP/home/.shipwright/connect.log +✓ Connect started (PID 77309) + Logs: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-connect-test.0FwUq6/home/.shipwright/connect.log Stop: shipwright connect stop ✓ ▸ cmd_start rejects if already running... Stop it first: shipwright connect stop @@ -587,29 +725,3230 @@ All 16 tests passed! ▸ cmd_stop handles missing PID gracefully... ⚠ Connect is not running (no PID file) ✓ -Status +Status ▸ cmd_status shows connected when PID alive... ✓ ▸ cmd_status shows disconnected when no PID... ✓ -Join Flow +Join Flow ▸ cmd_join verifies token against dashboard... ✓ ▸ cmd_join saves team-config.json... ✓ ▸ cmd_join rejects invalid token... ✓ ▸ cmd_join accepts --url and --token flags... ✓ -Heartbeat & Disconnect Payloads +Heartbeat & Disconnect Payloads ▸ Heartbeat payload includes required fields... ✓ ▸ Send disconnect sends proper payload... ✓ -Configuration & Utilities +Configuration & Utilities ▸ ensure_dir creates shipwright directory... ✓ ▸ now_iso returns valid ISO timestamp... ✓ ▸ Script has correct version... ✓ -Integration +Integration ▸ Help command shows all main commands... 
✓ ════════════════════════════════════════════════════  All 25 tests passed ✓ ════════════════════════════════════════════════════ + + Context Budget Monitor Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Budget Initialization + ✓ init creates config (exit 0) + ✓ config has correct total_budget + ✓ system_reserve is 10% (80000) + ✓ tools_reserve is 10% (80000) + ✓ working_memory is 60% (480000) + ✓ output_reserve is 20% (160000) + + Token Estimation + ✓ estimation produces JSON + ✓ estimate contains utilization_percent + ✓ estimate contains total_used (>0) + + Status Checking (Thresholds) + ✓ 50% utilization returns green status + ✓ 70% utilization returns yellow status + ✓ 85% utilization returns red status + ✓ 95% utilization returns critical status + + Context Trimming + ✓ green status doesn't trim content + ✓ yellow status reduces content length + ✓ hard truncate respects size limit + + Iteration Summarization + ✓ summarize_iteration succeeds (exit 0) + ✓ iteration summary file created with 1 entry + + Budget Report Generation + ✓ report contains budget_config + ✓ report contains iteration_summaries + + State Logging + ✓ log_state succeeds (exit 0) + ✓ log file created with entries + + + ────────────────────────────────────────── + + All 22 tests passed + + + + Shipwright Context Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows commands + ✓ --help exits 0 + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + + show subcommand + ✓ show exits 0 + ✓ show outputs context header + + clear subcommand + ✓ clear exits 0 + ✓ clear confirms cleared + + gather subcommand + ✓ gather without args exits 1 + ✓ gather shows must provide + ✓ gather with unknown option exits 1 + + gather with goal + ✓ gather exits 0 + ✓ gather shows building + ✓ gather shows success + ✓ 
context-bundle.md created + + show after gather + ✓ show after gather exits 0 + ✓ show contains pipeline context + + clear after gather + ✓ clear exits 0 + + gather with issue + ✓ gather with --issue exits 0 + ✓ gather shows building + + gather default stage + ✓ gather default stage exits 0 + + internal stage_guidance + ✓ stage_guidance returns plan guidance + ✓ stage_guidance returns build guidance + ✓ stage_guidance handles unknown stage + + + + ────────────────────────────────────────── + + All 26 tests passed + + + + Shipwright Cost Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help / version + ✓ help exits 0 + ✓ help shows USAGE + ✓ help shows COMMANDS + ✓ help mentions show + ✓ help mentions budget + ✓ help mentions calculate + ✓ VERSION variable defined + + state management + ✓ costs.json created on first use + ✓ budget.json created on first use + ✓ costs.json has entries array + ✓ budget.json has daily_budget_usd + + budget commands + ✓ budget set exits 0 + ✓ budget set to 50 + ✓ budget show exits 0 + + error handling + ✓ Unknown command exits non-zero + + calculate + ✓ calculate exits 0 + + script safety + ✓ Uses set -euo pipefail + ✓ ERR trap is set + + context efficiency in cost dashboard + ✓ Cost dashboard has CONTEXT EFFICIENCY section + ✓ Cost dashboard reads loop.context_efficiency events + ✓ Context efficiency reports utilization and waste + ✓ Dashboard renders CONTEXT EFFICIENCY with event data + ✓ Dashboard shows avg budget utilization + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright daemon test — Unit Tests (Synthetic Events) ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up test environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-daemon-test.f8LpIf + + ▸ dora_grade deploy_freq Elite (>= 7)... 
✓ + ▸ dora_grade deploy_freq High (>= 1)... ✓ + ▸ dora_grade deploy_freq Medium (>= 0.25)... ✓ + ▸ dora_grade deploy_freq Low (< 0.25)... ✓ + ▸ dora_grade CFR thresholds (Elite/High/Medium/Low)... ✓ + ▸ Stage timings filter-first jq query... ✓ + ▸ MTTR pairs failures with next success... ✓ + ▸ epoch_to_iso helper function... ✓ + ▸ Health check detects stale jobs... scripts/sw-daemon-test.sh: line 120: 80318 Terminated: 15 sleep 300 +✓ + ▸ Priority label sorting... ✓ + ▸ Degradation alert triggers on high CFR... ✓ + ▸ Metrics --json output with cycle_time keys... ✓ + ▸ Self-labeling includes watch_label when enabled... ✓ + ▸ Self-labeling excludes watch_label when disabled... ✓ + ▸ Patrol recurring failures label construction... ✓ + ▸ DORA degradation event detection... ✓ + ▸ Retry exhaustion event detection... ✓ + ▸ Untested script detection logic... ✓ + ▸ Progress detects stage advancement... ✓ + ▸ Progress detects stuck (no change N checks)... ✓ + ▸ Progress detects repeated error loop... ✓ + ▸ Progress resets on diff growth... ✓ + ▸ Hard limit kills even with progress on... scripts/sw-daemon-test.sh: line 200: 82303 Terminated: 15 sleep 300 +✓ + ▸ Adaptive cycles extends limit on >50% issue drop... ✓ + ▸ Adaptive cycles reduces limit on issue increase... ✓ + ▸ Adaptive cycles respects 2x base hard ceiling... ✓ + ▸ Adaptive cycles no-op on first cycle... ✓ + ▸ Cleanup: Checkpoint expire removes old checkpoints... ▸ Expired: build checkpoint (1h+ old) +✓ Expired 1 checkpoint(s) older than 1h +✓ + ▸ Cleanup: Failure handler removes watch label... ✓ + ▸ Cleanup: Failure handler closes draft PRs... ✓ + ▸ Cleanup: sw-cleanup.sh has all artifact cleanup sections... ✓ + ▸ Daemon sources vitals module... ✓ + ▸ Vitals verdict maps to daemon verdict (continue→healthy etc)... ✓ + ▸ Vitals emits pipeline.vitals_check events... ✓ + ▸ Auto-scale includes vitals health factor... ✓ + ▸ Quality memory drives template selection... 
✓ + ▸ Vitals-based progress with static fallback... ✓ + ▸ Memory: query fix for error returns matching fix... ✓ + ▸ Memory: DORA template escalation patterns exist... ✓ + ▸ Memory: All 12 error categories in post-tool-use.sh... ✓ + ▸ Daemon: Template weights selection reads weights file... ✓ + ▸ Daemon: Auto-enable self_optimize when auto_template is true... ✓ + ▸ Intelligence: classify_failure detects auth errors... ✓ + ▸ Intelligence: classify_failure has all 6 failure classes... ✓ + ▸ Intelligence: Retry skips auth_error and invalid_issue... ✓ + ▸ Intelligence: API errors get extended 300s backoff... ✓ + ▸ Intelligence: daemon_preflight_auth_check exists and auto-pauses... ✓ + ▸ Intelligence: Process group spawning (set -m)... ✓ + ▸ Intelligence: Process tree kill in cleanup (pkill -P)... ✓ + ▸ Intelligence: Consecutive failure auto-pause (3 threshold)... ✓ + ▸ Intelligence: Retry escalation args passed to spawn... ✓ + ▸ Intelligence: classify_failure wired into retry logic... ✓ + +━━━ Results ━━━ + Passed: 66 + Failed: 0 + Total: 52 + +All 66 tests passed! + + +Dashboard Smoke Tests + +Structure + ▸ server.ts exists... ✓ + ▸ server.ts has valid structure (imports, fetch)... ✓ + ▸ public/ directory exists... ✓ + ▸ index.html exists... ✓ + ▸ src/ directory exists... ✓ + ▸ main.ts entry point exists... ✓ + ▸ tsconfig.json exists... ✓ + ▸ src/ modules exist (core, views, components, design, types, canvas)... ✓ + +Routes + ▸ Server exports /api/health and /api/status... ✓ + ▸ Server exports /ws WebSocket route... ✓ + ▸ Server exports /api/context-efficiency... ✓ + +Integrity + ▸ bun check passes (if bun available)... ✓ + ▸ index.html references dist/main.js... ✓ + ▸ Frontend bundle builds (if bun available)... ✓ + ▸ TypeScript type check passes (if tsc available)... 
(typescript not installed, skipping) +✓ + +════════════════════════════════════════════════════ + All 15 tests passed ✓ +════════════════════════════════════════════════════ + + +╔════════════════════════════════════════════════════════╗ +║ shipwright db — SQLite Persistence Test Suite ║ +╚════════════════════════════════════════════════════════╝ + +Setting up test environment... + +Schema Creation + ▸ init_schema creates all tables... ✓ + ▸ Schema includes required tables... ✓ + ▸ WAL mode is enabled... ✓ + ▸ check_sqlite3 caches result... ✓ + +Database Availability + ▸ db_available returns true when ready... ✓ + ▸ db_available returns false without sqlite3... /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.TngspM/sw-db.sh: line 19: dirname: command not found +✓ + +Event CRUD Operations + ▸ db_add_event inserts event... ✓ + ▸ add_event dual-writes to SQLite + JSONL... ✓ + +Daemon State Management + ▸ db_save_job saves active job... ✓ + ▸ db_complete_job marks completed... ✓ + ▸ db_fail_job marks failed... ✓ + ▸ db_list_active_jobs returns JSON array... ✓ + ▸ db_active_job_count returns count... ✓ + ▸ db_is_issue_active checks status... ✓ + +Cost Tracking + ▸ db_record_cost saves entry... ✓ + ▸ db_cost_today calculates total... ✓ + ▸ db_set_budget and db_get_budget... ✓ + +Heartbeat Management + ▸ db_record_heartbeat saves heartbeat... ✓ + ▸ db_list_heartbeats returns array... ✓ + ▸ db_clear_heartbeat deletes entry... ✓ + +Memory & Failure Tracking + ▸ db_record_failure saves pattern... ✓ + ▸ db_query_similar_failures finds matches... ✓ + +JSON Data Migration + ▸ migrate_json_data imports events... ▸ Migrating schema v0 → v2... +✓ Migrated to schema v2 +▸ Migrating schema v0 → v3... +✓ Migrated to schema v3 +▸ Migrating schema v0 → v4... +✓ Migrated to schema v4 +▸ Migrating schema v0 → v5... +✓ Migrated to schema v5 +▸ Migrating schema v0 → v6... 
+✓ Migrated to schema v6 +▸ Importing events from /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.TngspM/home/.shipwright/events.jsonl... +✓ Events: 2 imported, 0 skipped (duplicates) +▸ Importing heartbeats... +✓ Heartbeats: 0 imported + +✓ Migration complete: 2 total records imported + +▸ Verification: + Events in DB: 2 + Cost entries: 0 + Heartbeats: 0 +✓ + ▸ migrate_json_data imports costs... ▸ Database already at schema v6 +▸ Importing events from /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.TngspM/home/.shipwright/events.jsonl... +✓ Events: 2 imported, 0 skipped (duplicates) +▸ Importing costs from /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.TngspM/home/.shipwright/costs.json... +✓ Costs: 2 entries imported +▸ Importing heartbeats... +✓ Heartbeats: 0 imported + +✓ Migration complete: 4 total records imported + +▸ Verification: + Events in DB: 2 + Cost entries: 2 + Heartbeats: 0 +✓ + ▸ migrate_json_data imports budget... ▸ Database already at schema v6 +▸ Importing events from /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.TngspM/home/.shipwright/events.jsonl... +✓ Events: 2 imported, 0 skipped (duplicates) +▸ Importing costs from /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.TngspM/home/.shipwright/costs.json... +✓ Costs: 2 entries imported +▸ Importing budget from /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.TngspM/home/.shipwright/budget.json... +✓ Budget: imported ($50.00, enabled=true) +▸ Importing heartbeats... +✓ Heartbeats: 0 imported + +✓ Migration complete: 4 total records imported + +▸ Verification: + Events in DB: 2 + Cost entries: 2 + Heartbeats: 0 +✓ + +Health Checks + ▸ db_health_check validates database... ✓ + +Export & Backup + ▸ export_db creates JSON backup... ▸ Exporting database to /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.TngspM/home/.shipwright/test-backup.json... 
+✓ Database exported to /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.TngspM/home/.shipwright/test-backup.json +✓ + +Data Cleanup + ▸ cleanup_old_data removes old entries... ▸ Cleaning records older than 30 days (before 2026-02-07T10:54:40Z)... +✓ Deleted: 0 events, 0 costs, 0 daemon jobs, 0 stages +✓ + +Concurrent Operations + ▸ Concurrent writes don't corrupt DB... 5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +✓ + +Pipeline Run Tracking + ▸ add_pipeline_run creates entry... ✓ + ▸ update_pipeline_status updates run... ✓ + +══════════════════════════════════════════════════════════ + All 31 tests passed ✓ +══════════════════════════════════════════════════════════ + + + + Shipwright Decompose Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows commands + ✓ --help exits 0 + + version flag + ✓ --version exits 0 + ✓ --version shows version + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + ✓ analyze without issue exits 1 + ✓ analyze shows usage + ✓ decompose without issue exits 1 + ✓ auto without issue exits 1 + + analyze subcommand (mock) + ✓ analyze exits 0 with NO_GITHUB + ✓ analyze outputs complexity_score + ✓ analyze outputs should_decompose + ✓ analyze outputs subtasks + ✓ analyze outputs valid JSON + ✓ analyze returns complexity_score 85 + ✓ analyze returns should_decompose true + + decompose subcommand (mock) + ✓ decompose exits 0 with NO_GITHUB + ✓ decompose shows decomposing + + auto subcommand (mock) + ✓ auto exits 0 with NO_GITHUB + + state file creation + ✓ events.jsonl created + + DAG scheduling (new features) + ✓ schedule exits 0 + ✓ 
schedule shows valid DAG + ✓ schedule shows waves + ✓ critical-path exits 0 + ✓ critical-path shows title + ✓ critical-path shows hours + ✓ visualize text exits 0 + ✓ visualize shows DAG title + ✓ visualize shows task 0 + ✓ visualize mermaid exits 0 + ✓ visualize mermaid has graph + ✓ help shows schedule cmd + ✓ help shows critical-path cmd + ✓ help shows visualize cmd + ✓ version shows 3.2.4 + ✓ mock data includes depends_on field + + + + ────────────────────────────────────────── + + All 39 tests passed + + + sw-decide Tests + ══════════════════════════════════════════ + + + help + ✓ help shows usage + ✓ help mentions run + ✓ help mentions status + ✓ help mentions tiers + ✓ help mentions candidates + ✓ help mentions halt + ✓ help mentions resume + ✓ help mentions dry-run + ✓ --help shows usage + + tiers + ✓ tiers shows auto + ✓ tiers shows propose + ✓ tiers shows draft + ✓ tiers shows category rules + ✓ tiers shows limits + + signals + ✓ candidate has id + ✓ candidate has signal + ✓ candidate has category + ✓ candidate has risk_score + ✓ candidate has dedup_key + ✓ candidate has collected_at + ✓ candidate is valid JSON + ✓ read_pending returns data + ✓ clear_pending empties file + + scoring + ✓ scored has value_score + ✓ scored has scores object + ✓ critical security scores well (75) + ✓ security scores higher than dead_code (75 > 33) + ✓ impact weight loaded + ✓ urgency weight loaded + + autonomy + ✓ deps_patch -> auto + ✓ refactor_hotspot -> propose + ✓ new_feature -> draft + ✓ unknown -> draft + ✓ auto labels include shipwright + ✓ auto labels include ready-to-build + ✓ propose labels include proposed +jq: parse error: Unmatched '}' at line 1, column 111 +jq: parse error: Unmatched '}' at line 1, column 111 + ✓ budget available with no decisions +jq: parse error: Unmatched '}' at line 1, column 111 + ✓ budget exhausted after 16 issues + ✓ rate limit passes with no history +jq: parse error: Unmatched '}' at line 1, column 111 + ✓ rate limit blocks recent 
decision + ✓ not halted initially + ✓ halted after halt() + ✓ halt file created + ✓ resumed after resume() + + risk ceiling + ✓ risk 20 below ceiling 30 + ✓ risk 35 above ceiling 30 + + dry-run + ✓ dry-run shows DRY RUN + ✓ no daily log created in dry-run + ✓ no drafts in dry-run + + decision log + ✓ daily log exists + ✓ log has decision id + ✓ log has value_score + ✓ last-decision written + + outcome learning + ✓ urgency weight adjusted on security success (25) + ✓ risk weight increased on failure (13 >= 13) + ✓ weights sum to 100 + ✓ weights file written + + candidates + ✓ candidates shows title + ✓ candidates shows signal + + halt/resume CLI + ✓ halt succeeds + ✓ run blocked when halted + ✓ resume succeeds + + status + ✓ status shows active + ✓ status shows decisions + ✓ status shows budget + ✓ status shows weights + + cycle integration + ✓ cycle shows Decision Engine + ✓ cycle shows Cycle Complete + ✓ cycle processes candidates + ✓ cycle shows AUTO tier + ✓ cycle shows DRAFT tier + + log command + ✓ log shows today's date + ✓ log shows entry + + ────────────────────────────────────────── + + All 33 tests passed + + + + Shipwright Deps Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows commands + ✓ --help exits 0 + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + + scan subcommand + ✓ scan exits 0 with NO_GITHUB + ✓ scan shows warning + + classify subcommand + ✓ classify without args exits 1 + ✓ classify shows usage + ✓ classify exits 0 with NO_GITHUB + ✓ classify shows warning + + batch subcommand + ✓ batch exits 0 with NO_GITHUB + ✓ batch shows warning + + report subcommand + ✓ report exits 0 with NO_GITHUB + ✓ report shows warning + + merge subcommand + ✓ merge without args exits 1 + + test subcommand + ✓ test without args exits 1 + + internal parse_version_bump + ✓ parse_version_bump detects patch + ✓ 
parse_version_bump detects minor + ✓ parse_version_bump detects major + ✓ parse_version_bump handles v prefix + + + + ────────────────────────────────────────── + + All 22 tests passed + + + + shipwright developer-simulation test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + +Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ source guard uses if/then/fi pattern + +Version + ✓ VERSION variable defined + +Help Output + ✓ help contains USAGE + ✓ help contains review subcommand + ✓ help contains address subcommand + ✓ help contains PERSONAS + ✓ help contains simulation_enabled + ✓ --help flag works + +Error Handling + ✓ unknown command exits non-zero + ✓ unknown command mentions error + +Review Subcommand + ✓ review warns when disabled + ✓ review returns empty JSON array when disabled + +Address Subcommand + ✓ address with no objections succeeds + ✓ address returns empty JSON + +Persona Definitions + ✓ security persona defined + ✓ performance persona defined + ✓ maintainability persona defined + +Configuration + ✓ SIMULATION_MAX_ROUNDS env var supported + ✓ daemon-config.json checked + +Event Emission + ✓ emits simulation.objection events + ✓ emits simulation.complete events + ✓ emits simulation.addressed events + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Discovery Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows commands + ✓ --help exits 0 + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + ✓ broadcast without args exits 1 + ✓ query without args exits 1 + ✓ inject without args exits 1 + + broadcast subcommand + ✓ broadcast exits 0 + ✓ broadcast confirms + ✓ discoveries.jsonl created + ✓ discoveries.jsonl contains valid JSON + + query subcommand + ✓ query exits 0 + ✓ query finds discovery + ✓ query non-match 
exits 0 + ✓ query reports no discoveries + + status subcommand + ✓ status exits 0 + ✓ status shows total + + clean subcommand + ✓ clean exits 0 + ✓ clean reports result + + inject subcommand + ✓ inject exits 0 + + internal patterns_overlap + ✓ patterns_overlap matches same pattern + ✓ patterns_overlap rejects different paths + + prioritize subcommand + ✓ prioritize security exits 0 + ✓ prioritize assigns P0 + ✓ prioritize assigns P3 + + score subcommand + ✓ score exits 0 + ✓ score returns numeric result: 89 + + acknowledge subcommand + ✓ acknowledge exits 0 + ✓ consumption file created + ✓ consumption count incremented + + consumption stats + ✓ consumption stats valid JSON + ✓ consumption count correct + + memory promotion + ✓ promotion function exists + + fleet broadcast + ✓ fleet broadcast function exists + + + + ────────────────────────────────────────── + + All 36 tests passed + + + + Shipwright Doc Fleet Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help + ✓ help shows title + ✓ help shows commands section + ✓ help shows fleet roles section + ✓ help shows examples + ✓ --help flag works + ✓ unknown command shows error + Roles + ✓ roles lists doc-architect + ✓ roles lists claude-md + ✓ roles lists strategy-curator + ✓ roles lists pattern-writer + ✓ roles lists readme-optimizer + Audit + ✓ audit shows health header + ✓ audit shows health score + ✓ audit checks doc inventory + ✓ audit checks CLAUDE.md + ✓ audit checks agent roles + ✓ audit creates state file + ✓ audit records health score (91%) + Launch + ✓ launch dry-run shows header + ✓ launch dry-run mentions dry-run + ✓ launch dry-run lists doc-architect + ✓ launch dry-run lists claude-md + ✓ launch dry-run lists strategy-curator + ✓ launch dry-run lists pattern-writer + ✓ launch dry-run lists readme-optimizer + ✓ launch dry-run shows agent count + ✓ launch specific role shows role + ✓ launch specific role shows 1 agent + ✓ launch invalid role shows error 
+ Status + ✓ status shows header + ✓ status shows last run + ✓ status shows health score + ✓ status shows session list + Manifest + ✓ manifest shows generation + ✓ manifest file created + ✓ manifest has documents (16) + Report + ✓ report shows header + ✓ report shows inventory + ✓ report shows volume + ✓ report shows fleet state + ✓ report json shows JSON + ✓ JSON report file created + Retire + ✓ retire shows retiring + ✓ retire shows count + State Persistence + ✓ status shows run count after commands + Events + ✓ doc_fleet events logged (8 events) + Aliases + ✓ start alias works + ✓ stop alias works + + + + ────────────────────────────────────────── + + All 48 tests passed + + + + Shipwright Docs Agent Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help + ✓ help shows usage + ✓ help shows commands + ✓ --help flag works + ✓ unknown command shows error + Coverage + ✓ coverage shows header + ✓ coverage shows total scripts + API Reference + ✓ api generation succeeds + ✓ api file has title + Wiki + ✓ wiki generation succeeds + ✓ wiki directory created + Scan + ✓ scan shows scanning + Sync + ✓ sync shows sync complete + Impact + ✓ impact shows analysis + State + ✓ docs-agent home directory exists + + + + ────────────────────────────────────────── + + All 14 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright docs — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +AUTO Section Discovery + ▸ find_auto_files discovers CLAUDE.md... PASS + ▸ get_sections extracts section IDs... PASS + +Section Generators + ▸ Architecture table has headers and scripts... PASS + ▸ Architecture table includes CLI router... PASS + ▸ Feature flags table with intelligence config... PASS + ▸ Test suites table picks up test files... PASS + +Section Check & Replace + ▸ check_section detects fresh vs stale... 
PASS + ▸ replace_section updates content between markers... PASS + +Subcommands + ▸ docs check returns 1 when stale... PASS + ▸ docs sync then check returns 0 (fresh)... PASS + ▸ docs sync is idempotent... PASS + +CLI & Help + ▸ Help output contains all subcommands... PASS + ▸ Unknown command exits with error... PASS + ▸ Default (no args) shows help... PASS + +Edge Cases + ▸ No AUTO markers returns 0... PASS + ▸ Multiple sections all get processed... PASS + ▸ Purpose extracted from script headers... PASS + ▸ Wiki dry-run succeeds... PASS + + ────────────────────────────────────────── + 18 passed 0 failed (18 total) + + + + Shipwright Doctor Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + execution + ✓ doctor runs without crash + ✓ output shows PREREQUISITES + ✓ detects tmux + ✓ detects jq + ✓ detects Claude Code CLI + ✓ detects git + + structure + ✓ VERSION variable defined + ✓ Uses set -euo pipefail + ✓ ERR trap is set + ✓ check_pass helper defined + ✓ check_fail helper defined + ✓ output shows Shipwright header + + check logic for tools + ✓ Source checks for tmux + ✓ Source checks for jq + ✓ Source checks for Claude CLI + ✓ Source checks for git + ✓ Source checks for gh + + version flag + ✓ --version outputs sw-doctor and version + ✓ -V short flag works + + missing tool handling + ✓ Doctor reports when jq missing from PATH + ✓ output includes PREREQUISITES section + ✓ output includes INSTALLED FILES section + + auto-fix mode + ✓ --fix-dry flag shows dry-run output +✗ Overlay not found: ~/.tmux/shipwright-overlay.conf +✗ No ~/.claude/settings.json — agent teams not configured +✗ Overlay not found — pane display features unavailable +✗ iTerm2 mouse reporting is DISABLED — tmux cannot receive mouse clicks + ✓ --fix creates .claude directories + ✓ --fix creates ~/.shipwright directories + ✓ --fix creates valid daemon-config.json + ✓ --fix creates valid settings.json + ✓ --fix creates valid budget.json +✗ No 
~/.claude/settings.json — agent teams not configured +✗ iTerm2 mouse reporting is DISABLED — tmux cannot receive mouse clicks + ✓ --fix is idempotent (second run succeeds) +✗ No ~/.claude/settings.json — agent teams not configured +✗ iTerm2 mouse reporting is DISABLED — tmux cannot receive mouse clicks + ✓ --fix handles existing config files safely + ✓ --fix without other args works + ✓ Auto-fix output shows AUTO-FIX SUMMARY + ✓ Auto-fix reports what was fixed + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + shipwright dora test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + +Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ source guard uses if/then/fi pattern + +Version + ✓ VERSION variable defined + +Help Output + ✓ help contains USAGE + ✓ help contains show subcommand + ✓ help contains dx subcommand + ✓ help contains ai subcommand + ✓ help contains trends subcommand + ✓ help contains compare subcommand + ✓ help contains export subcommand + ✓ help contains DORA BANDS + ✓ --help flag works + +Error Handling + ✓ unknown command exits non-zero + ✓ unknown command error message + +Show Subcommand (no events) + ✓ show displays DORA Metrics + ✓ show displays Deploy Frequency + ✓ show displays Lead Time + ✓ show displays Change Failure Rate + ✓ show displays MTTR + +DX Subcommand + ✓ dx displays Developer Experience + +AI Subcommand + ✓ ai displays AI Performance Metrics + +Export Subcommand + ✓ export produces JSON with timestamp + ✓ export contains current_period + ✓ export contains previous_period + +Trends Subcommand + ✓ trends displays Trends heading + +Compare Subcommand + ✓ compare displays Period Comparison + +DORA Band Classification + ✓ classify_band function defined + ✓ Elite band classification + ✓ High band classification + ✓ Medium band classification + ✓ Low band classification + +Trend Arrows + ✓ trend_arrow function defined + + + + + 
────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Durable Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help / version + ✓ help exits 0 + ✓ help shows USAGE + ✓ help mentions publish + ✓ help mentions consume + ✓ help mentions checkpoint + ✓ help mentions lock + ✓ help mentions compact + ✓ help mentions status + ✓ VERSION variable defined + + publish events + ✓ publish exits 0 + ✓ WAL file created + ✓ Event written to WAL + + error handling + ✓ publish without args exits non-zero + ✓ Unknown command exits non-zero + + status command + ✓ status exits 0 + + checkpointing + ✓ checkpoint save exits 0 + ✓ Checkpoint file created + ✓ checkpoint restore exits 0 + + distributed locks + ✓ lock acquire exits 0 + ✓ lock release exits 0 + + script safety + ✓ Uses set -euo pipefail + ✓ Has source guard pattern + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright E2E Orchestrator Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ unknown command exits nonzero + ✓ registry file created on init + ✓ registry is valid JSON + ✓ registry has >= 3 default suites + ✓ register adds suite + ✓ duplicate register fails + ✓ quarantine adds test + ✓ quarantine list has entry + ✓ report handles no results + ✓ flaky handles no history + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Eventbus Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows subcommands + ✓ --help exits 0 + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + + status subcommand + ✓ status exits 0 with empty bus + ✓ status shows title + + publish subcommand + ✓ publish exits 0 + ✓ publish confirms + 
✓ events.jsonl created + ✓ events.jsonl contains published event type + ✓ events.jsonl contains correlation_id + ✓ eventbus has 3+ events after multi-publish + + status with events + ✓ status with events exits 0 + ✓ status shows total events + ✓ status shows events by type + + clean subcommand + ✓ clean exits 0 + ✓ clean reports result + + replay subcommand + ✓ replay exits 0 + ✓ replay shows replaying + + publish error handling + ✓ publish with empty type exits 1 + ✓ watch with missing dir exits 1 + ✓ watch shows dir error + + + + ────────────────────────────────────────── + + All 24 tests passed + + + + Shipwright Feedback Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows subcommands + ✓ --help exits 0 + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + + collect subcommand + ✓ collect on empty dir exits 0 + ✓ collect shows collecting + ✓ collect shows save path + + collect with error log + ✓ collect with errors exits 0 + ✓ collect reports errors + + analyze subcommand + ✓ analyze missing file exits 1 + ✓ analyze shows not found + ✓ analyze exits 0 + ✓ analyze shows report + + learn subcommand + ✓ learn exits 0 + ✓ learn confirms capture + ✓ incidents.jsonl created + ✓ incidents.jsonl has valid JSONL + + report subcommand + ✓ report exits 0 + ✓ report shows incidents + ✓ report shows total + ✓ report no incidents exits 0 + ✓ report says no incidents + + create-issue subcommand + ✓ create-issue with NO_GITHUB exits 0 + ✓ create-issue skips with NO_GITHUB + + post-merge monitoring + ✓ post-merge exits 0 + ✓ post-merge shows monitoring + ✓ post-merge creates monitoring file + ✓ monitoring has merge_sha + ✓ monitoring has environment + + regression detection + ✓ regression detection outputs valid JSON + ✓ no regression flag + ✓ deploy failure detects regression + ✓ deploy failure is P0 + ✓ deploy failure type + ✓ error 
spike detects regression + ✓ error spike is P1 + ✓ error spike type + + correlate with changes + ✓ correlate outputs valid JSON + ✓ correlation has pr_number + + learn from outcome + ✓ outcomes exits 0 + ✓ outcomes shows recording + ✓ outcomes creates merge-outcomes file + ✓ outcomes file has valid JSONL + ✓ outcome has pr_number + + health report + ✓ health with no data exits 0 + ✓ health shows no data message + ✓ health shows statistics + ✓ health shows success rate + ✓ health shows regressions + + integrated post-merge workflow + ✓ workflow: monitoring complete + ✓ workflow: regression detection valid + ✓ workflow: correlation valid + ✓ workflow: outcome recorded + + + + ────────────────────────────────────────── + + All 55 tests passed + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright fix test — Unit Tests ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up test environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-fix-test.ZXqPJ8 + + ▸ Help output contains expected sections... ✓ + ▸ Help via --help flag... ✓ + ▸ Missing goal shows error... ✓ + ▸ Missing repos shows error... ✓ + ▸ Arg parsing — --repos comma-separated... ✓ + ▸ Arg parsing — --repos-from file... ✓ + ▸ Arg parsing — --repos-from missing file... ✓ + ▸ Arg parsing — --pipeline template... ✓ + ▸ Arg parsing — --max-parallel... ✓ + ▸ Arg parsing — --branch-prefix... ✓ + ▸ Dry run shows what would happen... ✓ + ▸ Dry run does not create state file... ✓ + ▸ Fix status — no sessions... ✓ + ▸ Fix status shows existing sessions... ✓ + ▸ Invalid repo directory shows error... ✓ + ▸ Fix start creates state file... ✓ + ▸ Fix start emits events... ✓ + ▸ Fix start — summary output... ✓ + ▸ Branch name sanitization... ✓ + ▸ Fix header shows configuration... ✓ + ▸ Non-git repo warning... ✓ + ▸ Per-repo event tracking... ✓ + +━━━ Results ━━━ + Passed: 22 + Failed: 0 + Total: 22 + +All 22 tests passed! 
+ + + + shipwright fleet-discover test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + +Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ source guard uses if/then/fi pattern + +Version + ✓ VERSION variable defined + +Help Output + ✓ help contains USAGE + ✓ help contains --org option + ✓ help contains --language option + ✓ help contains --dry-run option + ✓ help contains --json option + ✓ help contains --topic option + ✓ help contains --exclude-topic option + ✓ help contains --min-activity-days + +Argument Validation + ✓ missing --org exits non-zero + ✓ missing --org error message + +Error Handling + ✓ unknown option exits non-zero + ✓ unknown option error + +NO_GITHUB Check + ✓ NO_GITHUB blocks discovery + ✓ NO_GITHUB shows error + +Argument Parsing + ✓ supports --org=value syntax + ✓ supports --config=value syntax + ✓ supports --language=value syntax + ✓ supports --topic=value syntax + +Config Merge + ✓ merge added new repo + ✓ merge preserved existing repo + +Filter Logic + ✓ filters archived repos + ✓ filters disabled repos + ✓ checks has_issues + ✓ language filter applied + ✓ topic filter applied + ✓ exclude topic filter + ✓ checks .shipwright-ignore + +Event Emission + ✓ emits fleet.discover.completed event + ✓ emits fleet.discover.merged event + + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright fleet test — Unit Tests ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up test environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-fleet-test.giskRE + + ▸ Help output contains expected sections... ✓ + ▸ Help via --help flag... ✓ + ▸ Config parsing — valid config... ✓ + ▸ Config parsing — missing config file... ✓ + ▸ Config parsing — invalid JSON... ✓ + ▸ Config parsing — empty repos array... 
✓ + ▸ Config defaults applied... ✓ + ▸ Fleet init generates config template... ✓ + ▸ Fleet init skips when config exists... ✓ + ▸ Fleet start spawns tmux sessions per repo... ✓ + ▸ Fleet start skips missing repos... ✓ + ▸ Fleet start skips existing sessions... ✓ + ▸ Fleet start creates fleet state file... ✓ + ▸ Fleet start emits fleet.started event... ✓ + ▸ Fleet start applies repo-level overrides... ✓ + ▸ Fleet stop kills sessions and cleans state... ✓ + ▸ Fleet stop — no fleet running... ✓ + ▸ Fleet stop emits fleet.stopped event... ✓ + ▸ Fleet status — no fleet running... ✓ + ▸ Fleet status shows dashboard... ✓ + ▸ Fleet metrics — no events file... ✓ + ▸ Fleet metrics dashboard output... ✓ + ▸ Fleet metrics JSON output... ✓ + ▸ Fleet metrics period flag... ✓ + ▸ Session name generation... ✓ + ▸ Fleet start skips non-git repos... ✓ + ▸ Unknown subcommand... ✓ + +━━━ Results ━━━ + Passed: 27 + Failed: 0 + Total: 27 + +All 27 tests passed! + + + + shipwright fleet-viz test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + +Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ source guard uses if/then/fi pattern + +Version + ✓ VERSION variable defined + +Help Output + ✓ help contains USAGE + ✓ help contains overview subcommand + ✓ help contains workers subcommand + ✓ help contains insights subcommand + ✓ help contains queue subcommand + ✓ help contains costs subcommand + ✓ help contains export subcommand + ✓ --help flag works + +Error Handling + ✓ unknown command exits non-zero + ✓ unknown command error message + +Overview Subcommand + ✓ overview shows Fleet Overview + ✓ overview shows Active count + ✓ overview shows Queued count + ✓ overview shows Repos count + +Workers Subcommand + ✓ workers shows Worker Allocation + ✓ workers shows Remote Machines + +Queue Subcommand + ✓ queue shows Issue Queue + ✓ queue shows queued items + +Costs Subcommand + ✓ costs shows Fleet Costs + ✓ costs shows Total Spend + ✓ 
costs shows Per-Repo + ✓ costs shows Per-Model + +Export Subcommand + ✓ export produces JSON with active_jobs + +Insights Subcommand + ✓ insights shows Fleet Insights + ✓ insights shows Success Rate + +Default Command + ✓ default command shows Fleet Overview + +Health Helpers + ✓ get_health_status function defined + ✓ color_health function defined + ✓ healthy status handled + ✓ degraded status handled + ✓ failing status handled + + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright frontier — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Adversarial Review + ▸ Adversarial review produces structured findings... ✓ + ▸ Adversarial iteration converges on no critical findings... ✓ + +Developer Simulation + ▸ Simulation generates objections from 3 personas... ✓ + ▸ Simulation address returns action items... ✓ + +Architecture Enforcer + ▸ Architecture model has valid schema... ✓ + ▸ Architecture validates changes (violation detected)... ✓ + +Graceful Degradation + ▸ All three degrade gracefully when claude unavailable... 
✓ + +════════════════════════════════════════════════════ + All 7 tests passed ✓ +════════════════════════════════════════════════════ + + + + Shipwright GitHub App Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help + ✓ help shows usage + ✓ help shows commands + ✓ --help flag works + ✓ unknown command shows error + Status + ✓ status without config warns + Events + ✓ events with no log warns + Manifest + ✓ manifest contains app name + ✓ manifest contains webhook URL + ✓ manifest success message + ✓ manifest output is valid JSON + Configured Status + ✓ configured status shows app ID + ✓ configured status shows install ID + ✓ events shows recent events + Token + ✓ token without key file errors + Verify + ✓ verify without secret errors + + + + ────────────────────────────────────────── + + All 15 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright github-checks — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Checks API Availability + ▸ _gh_checks_available: returns true when API accessible... {"check_runs":[],"total_count":0} +✓ + ▸ _gh_checks_available: returns false on 403... ✓ + +Check Run CRUD + ▸ gh_checks_create_run: returns run ID from response... ✓ + ▸ gh_checks_create_run: handles 403 gracefully... ⚠ Failed to create check run 'test-check' (API returned 1) +ERROR: scripts/sw-github-checks-test.sh:224 exited with status 1 +✓ + ▸ gh_checks_update_run: sends correct PATCH request... ✓ + ▸ gh_checks_update_run: skips when run_id empty... ✓ + ▸ gh_checks_annotate: respects 50-annotation limit... ✓ + ▸ gh_checks_list_runs: parses response correctly... ✓ + ▸ gh_checks_complete: convenience wrapper works... ✓ + +Pipeline Integration + ▸ gh_checks_pipeline_start: creates runs for all stages... ✓ + ▸ gh_checks_stage_update: looks up stored run IDs... ✓ + +NO_GITHUB Guard + ▸ NO_GITHUB: all functions return early... 
✓ + +════════════════════════════════════════════════════ + All 12 tests passed ✓ +════════════════════════════════════════════════════ + + +╔═══════════════════════════════════════════════════╗ +║ shipwright github-deploy — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Deployment CRUD + ▸ gh_deploy_create: returns deployment ID... ✓ + ▸ gh_deploy_create: handles 403 gracefully... ⚠ Failed to create deployment for ref 'main' to 'production' (API returned 1) +ERROR: scripts/sw-github-deploy-test.sh:158 exited with status 1 +✓ + ▸ gh_deploy_update_status: sends correct POST... ✓ + ▸ gh_deploy_update_status: skips when deploy_id empty... ✓ + ▸ gh_deploy_list: parses deployment list... ✓ + ▸ gh_deploy_latest: returns first result... ✓ + +Rollback + ▸ gh_deploy_rollback: creates new deployment with prev ref... ▸ Rolling back to ref: v2.9 +✓ Rolled back to v2.9 (deployment 301) +✓ + +Pipeline Integration + ▸ gh_deploy_pipeline_start: stores deployment ID... ✓ + ▸ gh_deploy_pipeline_complete: updates status correctly... ✓ + +NO_GITHUB Guard + ▸ NO_GITHUB: all functions return early... ✓ + +════════════════════════════════════════════════════ + All 10 tests passed ✓ +════════════════════════════════════════════════════ + + +━━━ shipwright github-graphql tests ━━━ + + ▸ Fresh cache returns cached data... ✓ + ▸ Stale cache returns miss... ✓ + ▸ Atomic writes leave no temp files... ✓ + ▸ gh_file_change_frequency returns count... ✓ + ▸ gh_contributors returns parsed list... ✓ + ▸ gh_similar_issues truncates long text... ✓ + ▸ gh_branch_protection handles 404... ✓ + ▸ gh_security_alerts handles 403... ✓ + ▸ gh_dependabot_alerts handles 403... ✓ + ▸ gh_repo_context aggregates data... ✓ + ▸ NO_GITHUB returns defaults... ✓ + ▸ _gh_detect_repo parses SSH URL... ✓ + ▸ _gh_detect_repo parses HTTPS URL... ✓ + ▸ _gh_detect_repo parses HTTPS without .git... ✓ + ▸ gh_codeowners parses CODEOWNERS file... 
✓ + ▸ Cache clear removes all files... ✓ + ▸ gh_commit_history returns parsed commits... ✓ + ▸ gh_blame_data aggregates authors... ✓ + ▸ gh_actions_runs calculates duration... ✓ + ▸ Events emitted for cache hit/miss... ✓ + +━━━ Results ━━━ + Passed: 20 + Failed: 0 + Total: 20 + +All 20 tests passed! + + + + Shipwright Guild Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ no args shows help + ✓ list shows Available Guilds + ✓ guild config.json created + ✓ guild config is valid JSON + ✓ guilds.json data file created + ✓ show security guild + ✓ show invalid guild exits nonzero + ✓ show without name exits nonzero + ✓ add pattern succeeds + ✓ pattern saved in data file + ✓ report shows guild data + ✓ report for specific guild + ✓ inject security shows knowledge + ✓ unknown command exits nonzero + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright heartbeat + checkpoint — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Heartbeat Lifecycle + ▸ Write heartbeat creates JSON file... ✓ + ▸ Check heartbeat reports alive... ✓ Job test-job-alive alive (0s ago) +✓ + ▸ Check heartbeat reports stale... ⚠ Job test-job-stale stale (195216940s ago, timeout: 120s) +✓ + ▸ Clear heartbeat removes file... ✓ Cleared heartbeat for job: test-job-clear +✓ + ▸ List heartbeats returns JSON array... ✓ + ▸ Heartbeat update overwrites existing... ✓ + ▸ Check missing heartbeat returns error... ✓ + ▸ Heartbeat dir auto-created when missing... ✓ + +Checkpoint Lifecycle + ▸ Checkpoint save creates JSON file... ✓ Checkpoint saved for stage build (iteration 5) +✓ + ▸ Checkpoint restore outputs JSON... ✓ Checkpoint saved for stage test (iteration 3) +✓ + ▸ Checkpoint restore missing stage fails... ✓ + ▸ Checkpoint clear removes file... 
✓ Checkpoint saved for stage review (iteration 1) +✓ Cleared checkpoint for stage review +✓ + ▸ Checkpoint clear --all removes all... ✓ Checkpoint saved for stage build (iteration 1) +✓ Checkpoint saved for stage test (iteration 2) +✓ Cleared 2 checkpoint(s) +✓ + ▸ Checkpoint save with files-modified... ✓ Checkpoint saved for stage build (iteration 7) +✓ + +Integration + ▸ Pipeline script has heartbeat functions... ✓ + ▸ Loop script has heartbeat and checkpoint... ✓ + ▸ Pipeline has human intervention checks... ✓ + +════════════════════════════════════════════════════ + All 17 tests passed ✓ +════════════════════════════════════════════════════ + +sw-hello-test.sh + ✓ hello command outputs 'hello world' + ✓ hello command exits with code 0 + ✓ hello --help displays help text + ✓ hello -h displays help text + ✓ hello --version displays version + ✓ hello with invalid option exits with code 1 + +PASS: 6 +FAIL: 0 + + + Shipwright Hygiene Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows subcommands + ✓ --help exits 0 + + error handling + ✓ unknown subcommand exits 1 + ✓ unknown subcommand shows error + + report subcommand + ✓ report exits 0 + ✓ report shows generating + ✓ report exits 0 + ✓ report creates JSON file + ✓ report JSON is valid + ✓ report JSON has timestamp + ✓ report JSON has sections + + structure subcommand + ✓ structure exits 0 + ✓ structure reports validating + + naming subcommand + ✓ naming exits 0 + ✓ naming shows checking + + dead-code subcommand + ✓ dead-code exits 0 + ✓ dead-code shows scanning + + dependencies subcommand + ✓ dependencies exits 0 + ✓ dependencies shows auditing + + platform-refactor subcommand + ✓ platform-refactor exits 0 + ✓ platform-refactor scans for hardcoded/fallback + ✓ platform-refactor creates platform-hygiene.json with counts + + policy read (policy_get from config) + ✓ policy_get returns value from config 
+ ✓ policy_get returns default when key missing + + + + ────────────────────────────────────────── + + All 26 tests passed + + + + Shipwright Incident Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help / version + ✓ help exits 0 + ✓ help shows USAGE + ✓ help mentions watch + ✓ help mentions list + ✓ help mentions report + ✓ help mentions stats + ✓ VERSION variable defined + + error handling + ✓ Unknown command exits non-zero + + list command + ✓ list with no incidents exits 0 + + stats command + ✓ stats with no data exits 0 + + state management + ✓ Incident config created + ✓ Incident config is valid JSON + + script safety + ✓ Uses set -euo pipefail + ✓ Has source guard pattern + ✓ ERR trap is set + + config subcommand + ✓ config subcommand creates valid JSON config + + subcommand usage +ERROR: scripts/sw-incident-test.sh:203 exited with status 1 + ✓ show subcommand fails or shows usage when missing args + ✓ report subcommand shows usage when missing args + + detect_pipeline_failures + ✓ detect_pipeline_failures defined and callable + ✓ report with nonexistent incident handles gracefully + ✓ gap list subcommand produces expected output + + timeline updates + ✓ Timeline update creates valid JSON + ✓ Timeline entry has all required fields + + correlation engine + ✓ Correlation engine produces valid JSON + + escalation logic + ✓ Escalation includes P0 rules + ✓ Escalation function defined + + rollback verification + ✓ Rollback verification function defined + ✓ Rollback verification updates timeline + + auto-remediate subcommand + ✓ auto-remediate subcommand shows usage when missing args + + deep analysis + ✓ Deep analysis function defined + ✓ Deep analysis returns valid JSON with required fields + ✓ Config includes auto_remediate_enabled flag + ✓ Config includes escalation rules for all severities + + post-mortem report + ✓ Post-mortem report includes Timeline section + ✓ Post-mortem report references timeline 
fields + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright init — E2E Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up sandboxed environment... +Temp dir: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-init-test.8F6X3Q + +Configuration + ▸ Settings.json created with agent teams... ✓ + ▸ Settings merge preserves existing vars... ✓ + ▸ tmux.conf installed... ✓ + ▸ Overlay installed... ✓ + +Templates + ▸ Team templates installed (>= 10)... ✓ + ▸ Pipeline templates installed (>= 5)... ✓ + ▸ Legacy templates path populated... ✓ + +Robustness + ▸ Idempotency — double init safe... ✓ + ▸ Doctor runs at end... ✓ + ▸ Help flag... ✓ + +Hook Wiring + ▸ JSONC stripped from settings.json... ✓ + ▸ Hooks wired into settings.json... ✓ + ▸ Hook wiring preserves existing hooks... ✓ + ▸ SessionStart hook installed... ✓ + ▸ Hook wiring with pre-existing settings... ✓ + +Repair & Cleanup + ▸ Legacy overlay cleanup... ✓ + ▸ Legacy overlay source-file reference stripped... ✓ + ▸ Repair mode forces clean reinstall... ✓ + ▸ Plugin direct-clone fallback (outside tmux)... ✓ + ▸ Post-install verification... ✓ + ▸ tmux adapter deployed... 
✓ + +════════════════════════════════════════════════════ + All 21 tests passed ✓ +════════════════════════════════════════════════════ + + + + Shipwright Instrument Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage + ✓ help shows commands + ✓ help shows start + ✓ help shows record + ✓ help shows version + + Start Command + ✓ start without --run-id exits non-zero + ✓ start without --run-id shows error + ✓ start with --run-id exits 0 + ✓ start confirms run ID + ✓ start creates run file + ✓ run file contains correct run_id + ✓ run file contains correct issue + + Record Command + ✓ record exits 0 + ✓ record confirms metric + ✓ run file has 1 metric + ✓ record without all args exits non-zero + ✓ record on missing run exits non-zero + ✓ record on missing run shows error + + Stage Start/End + ✓ stage-start exits 0 + ✓ stage-start confirms stage + ✓ stage-end exits 0 + ✓ stage result recorded + + Finish Command + ✓ finish exits 0 + ✓ finish confirms completion + ✓ finish removes active run file + ✓ finish writes to completed JSONL + + Trends & Export + ✓ trends with no data exits 0 + ✓ trends with no data warns + ✓ export with no data exits 0 + ✓ export with no data warns + + Error Handling + ✓ unknown command exits non-zero + ✓ unknown command shows error + ✓ events.jsonl created from instrument operations + + + + ────────────────────────────────────────── + + All 33 tests passed + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright intelligence test — Unit Tests ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up test environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-intelligence-test.U15awc + + ▸ analyze_issue returns valid schema... ✓ + ▸ Cache hit on second call with same input... 
WARN: Unknown event type 'intelligence.cache_hit' — update config/event-schema.json +✓ + ▸ Graceful degradation when claude CLI unavailable... ✓ + ▸ compose_pipeline produces valid pipeline JSON... ✓ + ▸ recommend_model returns valid model names... WARN: Unknown event type 'intelligence.model' — update config/event-schema.json +WARN: Unknown event type 'intelligence.model' — update config/event-schema.json +WARN: Unknown event type 'intelligence.model' — update config/event-schema.json +WARN: Unknown event type 'intelligence.model' — update config/event-schema.json +✓ + ▸ predict_cost returns numeric estimates... WARN: Unknown event type 'intelligence.prediction' — update config/event-schema.json +✓ + ▸ Cache TTL expiry returns miss... scripts/sw-intelligence-test.sh: line 334: _intelligence_md5: command not found +✓ + ▸ search_memory returns ranked results... ✓ + ▸ Feature flag disabled returns fallback... ✗ Expected 'intelligence_disabled', got 'null' (returns intelligence_disabled error) +✓ + ▸ Events emitted for analysis... ✓ + ▸ recommend_model emits events... WARN: Unknown event type 'intelligence.model' — update config/event-schema.json +✓ + ▸ Cache init creates file if missing... ✓ + +━━━ Results ━━━ + Passed: 12 + Failed: 0 + Total: 12 + +All 12 tests passed! 
+ + + + Shipwright Jira Test Suite + ══════════════════════════════════════════ + + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ VERSION variable defined + ✓ VERSION is semver: 3.2.4 + + Help Output + ✓ help mentions USAGE + ✓ help mentions sync command + ✓ help mentions update command + ✓ help mentions status command + ✓ help mentions init command + ✓ help mentions JIRA_BASE_URL + ✓ --help works + + Error Handling + +shipwright jira — Jira ↔ GitHub Bidirectional Sync + +USAGE + shipwright jira [options] + +COMMANDS + sync [--dry-run] Sync Jira To Do issues → GitHub + update Update linked Jira ticket status + status Show Jira board dashboard + init Configure Jira connection + help Show this help + +STATUS VALUES + started Pipeline spawned → Jira: In Progress + review PR created → Jira: In Review + done Pipeline complete → Jira: Done + failed Pipeline failed → Jira: adds failure comment + +EXAMPLES + shipwright jira init # Set up Jira connection + shipwright jira sync # Sync To Do → GitHub + shipwright jira sync --dry-run # Preview what would sync + shipwright jira update 42 started # Mark as In Progress + shipwright jira update 42 review # Mark as In Review + shipwright jira update 42 done # Mark as Done + shipwright jira status # Show board dashboard + +ENVIRONMENT + JIRA_BASE_URL Jira instance URL (or use 'jira init' to save) + JIRA_EMAIL Account email for authentication + JIRA_API_TOKEN API token from Atlassian account + JIRA_PROJECT_KEY Jira project key (e.g. 
PROJ) + ✓ unknown command exits non-zero + ✓ unknown command shows error + + Default Behavior + ✓ no-arg defaults to help + + Configuration + ✓ sync without config shows error + ✓ update without config shows error + ✓ status without config shows error + + Config Loading + ✓ sync with config proceeds + + Update Subcommand + ✓ update without args shows usage + + Notify Integration + ✓ notify subcommand executes without crash + + Atomic Writes + ✓ init uses atomic write (tmp + mv) + ✓ config file gets restricted permissions + + Event Emission + ✓ emits jira events + ✓ uses EVENTS_FILE for event logging + + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright launchd + systemd — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +OS Detection + ▸ macOS detection sets OSTYPE correctly... ✓ + ▸ Linux detection routes to systemd... ✓ + +macOS Plist Generation + ▸ Daemon plist has correct structure... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Dashboard plist has correct arguments... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Connect plist created only when team-config exists... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/Library/LaunchAgents/com.shipwright.dashboard.plist +▸ Skipping connect plist — /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/.shipwright/team-config.json not found +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/fresh-home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Plist files have correct permissions... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + +Install Command + ▸ Install creates LaunchAgents directory... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Install calls launchctl load... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +scripts/sw-launchd-test.sh: line 457: [[: 0 +0: syntax error in expression (error token is "0") +✓ + ▸ Install fails gracefully if sw binary not found... ✓ + +Uninstall Command + ▸ Uninstall removes plist files... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +▸ Uninstalling launchd agents... +Mock unloaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Unloaded daemon service +✓ Removed daemon plist +Mock unloaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Unloaded dashboard service +✓ Removed dashboard plist +Mock unloaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Unloaded connect service +✓ Removed connect plist + +✓ Uninstalled all launchd agents +✓ + ▸ Uninstall calls launchctl unload... ▸ Installing launchd agents... 
+✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +▸ Uninstalling launchd agents... 
+Mock unloaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Unloaded daemon service +✓ Removed daemon plist +Mock unloaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Unloaded dashboard service +✓ Removed dashboard plist +Mock unloaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Unloaded connect service +✓ Removed connect plist + +✓ Uninstalled all launchd agents +scripts/sw-launchd-test.sh: line 549: [[: 0 +0: syntax error in expression (error token is "0") +✓ + ▸ Uninstall on empty system doesn't error... ▸ Uninstalling launchd agents... + +✓ Uninstalled all launchd agents +✓ + +Status Command + ▸ Status command checks launchctl list... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Status shows log directory... ✓ + +Help Command + ▸ Help command shows usage and examples... ✓ + ▸ Help is shown for unknown commands... ✓ + +Environment & Configuration + ▸ Plist contains correct environment variables... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Working directory is set in plist... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Plist configures stdout and stderr logging... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.TQ0xa0/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.TQ0xa0/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Version variable is defined... ✓ + +════════════════════════════════════════════════════ + All 20 tests passed ✓ +════════════════════════════════════════════════════ + + + + Shipwright Linear Test Suite + ══════════════════════════════════════════ + + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ VERSION variable defined + ✓ VERSION is semver: 3.2.4 + + Help Output + ✓ help mentions USAGE + ✓ help mentions sync command + ✓ help mentions update command + ✓ help mentions status command + ✓ help mentions init command + ✓ help mentions LINEAR_API_KEY + ✓ --help works + + Error Handling + +shipwright linear — Linear ↔ GitHub Bidirectional Sync + +USAGE + shipwright linear [options] + +COMMANDS + sync [--dry-run] Sync Linear Todo issues → GitHub + update Update linked Linear ticket status + status Show sync dashboard + init Configure Linear API key + help Show this help + +STATUS VALUES + started Pipeline spawned → Linear: In Progress + review PR created → Linear: In Review + done Pipeline complete → Linear: Done + failed Pipeline failed → Linear: adds failure comment + +EXAMPLES + shipwright linear init # Set up API key + shipwright linear sync # Sync Todo → GitHub + shipwright linear sync --dry-run # Preview what would sync + 
shipwright linear update 42 started # Mark as In Progress + shipwright linear update 42 review # Mark as In Review + shipwright linear update 42 done # Mark as Done + shipwright linear status # Show dashboard + +ENVIRONMENT + LINEAR_API_KEY API key (or use 'linear init' to save) + LINEAR_TEAM_ID Override team ID + LINEAR_PROJECT_ID Override project ID + ✓ unknown command exits non-zero + ✓ unknown command shows error + + Default Behavior + ✓ no-arg defaults to help + + Configuration + ✓ sync without API key shows error + ✓ update without API key shows error + ✓ status without API key shows error + + Config Loading + ✓ sync with config proceeds + + Update Subcommand + ✓ update without args shows usage + + Notify Integration + ✓ notify subcommand executes without crash + + Atomic Writes + ✓ init uses atomic write (tmp + mv) + ✓ config file gets restricted permissions + + GraphQL Helper + ✓ linear_graphql helper defined + ✓ uses jq --arg for safe JSON escaping + + Event Emission + ✓ emits linear events + ✓ uses EVENTS_FILE for event logging + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + shipwright logs test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + +Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + +Version + ✓ VERSION variable defined + +Help Output + ✓ help contains USAGE + ✓ help contains --pane option + ✓ help contains --follow option + ✓ help contains --grep option + ✓ help contains --capture option + ✓ help contains -f shorthand + +Help Exit Code + ✓ help exits 0 + ✓ -h exits 0 + +Error Handling + ✓ unknown option exits non-zero + ✓ unknown option error + +List Logs + ✓ list shows Agent Logs heading + ✓ list shows team directory + +Team Logs + ✓ team logs shows team name + ✓ team logs lists log files + ✓ team logs shows builder log + +Grep Search + ✓ grep finds ERROR pattern + ✓ grep shows file context + ✓ grep shows no matches warning + +Pane Filter + ✓ 
pane filter shows reviewer logs + ✓ nonexistent pane warns + +Capture Command + ✓ capture reports status + +Missing Arguments + ✓ missing --pane value exits non-zero + ✓ missing --pane shows error + ✓ missing --grep value exits non-zero + ✓ missing --grep shows error + +Intelligence Integration + ✓ intelligence_available function defined + ✓ semantic_rank_results function defined + +Script Structure + ✓ LOGS_DIR defined + ✓ capture_logs function defined + ✓ list_logs function defined + ✓ show_team_logs function defined + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Loop Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help / version + ✓ --help exits 0 + ✓ --help shows usage + ✓ --help shows options + ✓ --help mentions --max-iterations + ✓ --help mentions --test-cmd + ✓ --help mentions --model + ✓ --help mentions --agents + ✓ --help mentions --resume + ✓ VERSION variable defined in sw-loop.sh + + argument parsing + ✓ No arguments exits non-zero + + script safety + ✓ Uses set -euo pipefail + ✓ ERR trap is set + ✓ SIGHUP trap set for daemon resilience + ✓ CLAUDECODE env var is unset + + defaults + ✓ Default MAX_ITERATIONS is 20 + ✓ Default AGENTS is 1 + ✓ Default MAX_RESTARTS is 0 + ✓ Sources lib/compat.sh + + json output format + ✓ build_claude_flags includes --output-format json + effort level flag + ✓ build_claude_flags supports --effort + fallback model flag + ✓ build_claude_flags supports --fallback-model + ✓ accumulate_loop_tokens parses JSON usage + ✓ LOOP_COST_MILLICENTS initialized + ✓ write_loop_tokens includes cost_usd + ✓ _extract_text_from_json helper defined + ✓ validate_claude_output helper defined + ✓ check_budget_gate helper defined + ✓ run_claude_iteration separates stdout from stderr + + json extraction robustness + ✓ _extract_text_from_json handles empty file + ✓ _extract_text_from_json extracts .result from JSON + ✓ _extract_text_from_json passes 
through plain text + + default config from source + ✓ Default MAX_ITERATIONS is 20 (from source) + ✓ Default AGENTS is 1 (from source) + ✓ Default MAX_RESTARTS is 0 (from source) + + json extraction edge cases + ✓ _extract_text_from_json handles nested JSON objects + ✓ _extract_text_from_json handles binary garbage without crash + + script structure + ✓ Script has circuit breaker logic + ✓ Script has stuckness detection + ✓ Script has test/quality gate functions + ✓ Help text defines --model and --agents flags + ✓ Help text defines --test-cmd and --resume flags + help mentions --effort + ✓ Help text defines --effort flag + help mentions --fallback-model + ✓ Help text defines --fallback-model flag + + loop behavior: LOOP_COMPLETE + ✓ Loop detected completion signal + + loop behavior: iterations on test failure + ✓ Loop runs multiple iterations when tests fail initially + + loop behavior: max iterations + ✓ Loop stops at max iterations + + loop behavior: stuckness detection + ✓ Loop stops at limit (stuckness test) + + loop behavior: budget gate + ✓ Budget gate stops loop + + validate_claude_output + ✓ validate_claude_output catches corrupt output + + loop behavior: progress tracking + ✓ Loop tracks progress via git + + context efficiency metrics + ✓ loop.context_efficiency event exists in run_claude_iteration + ✓ Context efficiency emits raw and trimmed char counts + ✓ Context efficiency emits trim_ratio and budget_utilization + ✓ raw_prompt_chars measured from pre-trim prompt + + multi-test gate + ✓ ADDITIONAL_TEST_CMDS variable defined + ✓ --additional-test-cmds flag in arg parser + ✓ --help documents --additional-test-cmds + ✓ run_test_gate writes test-evidence JSON + ✓ run_audit_agent reads structured test evidence + + verification gap handler + ✓ Verification gap detection present + ✓ Verification gap resolved event emitted + ✓ Verification gap confirmed event emitted + ✓ Verification gap can override audit result + ✓ Verification re-runs tests to dedicated log 
+ ✓ Mid-build test file discovery integrated + + + + ────────────────────────────────────────── + + All 65 tests passed + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright memory+cost test — Unit Tests for Memory & Cost ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up mock environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-memory-test.wHjjN6 + + ▸ Memory capture from pipeline state... ✓ + ▸ Memory inject returns context for each stage... ✓ + ▸ Failure capture stores patterns... ✓ + ▸ Pattern detection identifies project type... ✓ + ▸ Cross-repo vs per-repo isolation... ✓ + ▸ Memory show displays dashboard... ✓ + ▸ Memory search finds matching entries... ✓ + ▸ Memory export produces valid JSON... ✓ + ▸ Memory forget clears repo memory... ✓ + ▸ Cost calculation for each model... ✓ + ▸ Cost recording writes to costs.json... ✓ + ▸ Budget set and check... ✓ + ▸ Cost dashboard runs without errors... ✓ + ▸ Cost JSON output is valid... ✓ + ▸ Actionable failures threshold filtering... ✓ + ▸ Actionable failures with no file returns []... ✓ + ▸ DORA baseline calculation from events... ✓ + ▸ Error log entries captured into failures.json... ✓ + ▸ Fix outcome tracking increments counters... ✓ + ▸ Closed-loop inject returns formatted fix... ✓ + ▸ Global aggregation promotes frequent patterns... ✓ + ▸ Finalize pipeline runs capture + aggregate... ▸ Capturing pipeline learnings for test-org/test-repo... +✓ Captured pipeline learnings (status: complete) +✓ + +━━━ Results ━━━ + Passed: 22 + Failed: 0 + Total: 22 + +All 22 tests passed! 
+ + + + Shipwright Mission Control Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Navigation + ✓ help shows usage + ✓ help shows commands + ✓ --help flag works + ✓ unknown command shows error + Overview + ✓ overview shows MISSION CONTROL header + ✓ overview shows Summary Statistics + ✓ overview shows Active Pipelines + ✓ overview with active job shows count + Agent Tree + ✓ agents shows hierarchy + ✓ agents shows Pipeline Agent + Resources + ✓ resources shows utilization + Alerts + ✓ alerts shows alert feed + Stage Commands + ✓ pause without id exits nonzero + ✓ pause emits success + ✓ resume without id exits nonzero + ✓ resume emits success + ✓ skip without stage shows usage + ✓ skip emits success + ✓ retry emits success + + + + ────────────────────────────────────────── + + All 19 tests passed + + + + Shipwright Model Router Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage + ✓ help shows route + ✓ help shows escalate + ✓ help shows config + + Route Model + ✓ route intake at 50 = haiku + ✓ route build at 50 = opus + ✓ route test at 50 = sonnet + ✓ route build at 10 (low) = sonnet + ✓ route intake at 90 (high) = opus + ✓ route unknown stage at 50 = sonnet + + Escalate Model + ✓ escalate haiku -> sonnet + ✓ escalate sonnet -> opus + ✓ escalate opus -> opus (ceiling) + ✓ escalate unknown exits non-zero + + Config + ✓ config show displays JSON + ✓ config creates default file + ✓ config set confirms update + ✓ config set persists value + + Estimate + ✓ estimate shows stages + ✓ estimate shows total + + Report + ✓ report with no data warns + + Record Usage + ✓ record_usage creates usage file + ✓ record_usage writes entries + ✓ report with data shows summary + ✓ report shows total runs + ✓ report shows cost + ✓ report shows model counts + + Route All Stages & Complexity + ✓ route intake at 50 returns model + ✓ route plan 
at 50 returns model + ✓ route design at 50 returns model + ✓ route build at 50 returns model + ✓ route test at 50 returns model + ✓ route review at 50 returns model + ✓ route compound_quality at 50 returns model + ✓ route validate at 50 returns model + ✓ route monitor at 50 returns model + ✓ route plan at low complexity = sonnet + ✓ route plan at high complexity = opus + + Config Set/Show Cycle +✓ Updated cost_aware_mode = false + ✓ config show reflects settings + ✓ config set persists + ✓ estimate with low complexity shows stages + ✓ estimate shows Total + ✓ estimate with high complexity + + Error Handling + ✓ unknown subcommand exits non-zero + ✓ unknown subcommand shows error + + Reasoning Chains + ✓ chain config shows templates + ✓ chain config shows explore-synthesize-decide + ✓ chain config shows fast-verify + ✓ chain config shows deep-analysis + ✓ chain config creates templates file + ✓ chain define shows success + ✓ chain define persists custom chain + ✓ chain_score_confidence returns numeric score + ✓ chain_score_confidence scores conclusion text higher + ✓ chain execute returns valid JSON + ✓ chain execute result has steps + ✓ chain step-cost returns numeric cost for haiku + ✓ chain step-cost ordering correct (haiku < sonnet < opus) + ✓ chain report outputs summary + ✓ chain define with invalid JSON exits non-zero + ✓ chain define validates JSON + ✓ explore-decide has 2 steps + ✓ explore-decide first step is haiku + ✓ explore-decide last step is opus + ✓ chain execute with invalid chain exits non-zero + ✓ chain execute shows error + + + + ────────────────────────────────────────── + + All 66 tests passed + + + + Shipwright OTel Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage + ✓ help shows metrics + ✓ help shows trace + ✓ help shows export + ✓ help shows webhook + ✓ help shows dashboard + + Metrics + ✓ metrics text has pipelines_total + ✓ metrics text has 
active_pipelines + ✓ metrics text has cost + ✓ metrics text has queue depth + ✓ metrics json has metrics key + ✓ metrics json has pipelines_total + ✓ metrics json is valid JSON + + Trace + ✓ trace has resourceSpans + ✓ trace has service.name + ✓ trace output is valid JSON + + Dashboard + ✓ dashboard has dashboard key + ✓ dashboard output is valid JSON + + Report + ✓ report shows header + ✓ report shows events section + ✓ report shows pipeline metrics + ✓ report shows recommendations + + Webhook + ✓ webhook without URL exits non-zero + ✓ webhook without URL shows error + + Metrics With Events + ✓ metrics count total pipelines = 2 + + Error Handling + ✓ unknown command exits non-zero + ✓ unknown command shows error + + + + ────────────────────────────────────────── + + All 27 tests passed + + + + Shipwright Oversight Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ unknown command exits 1 + ✓ members shows board + ✓ members.json created + ✓ members.json is valid JSON + ✓ config show works + ✓ config.json created + ✓ stats shows statistics + ✓ history handles empty + ✓ review without args exits nonzero + ✓ review submission accepted + ✓ review JSON file created + ✓ review file is valid JSON + ✓ stats shows total reviews >= 1 + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Patrol Meta Test Suite + ══════════════════════════════════════════ + + + Script Structure + ✓ contains sourced-file note + ✓ does NOT have set -euo pipefail as code (sourced script) + ✓ no main() function (sourced script) + + Function Definitions + ✓ patrol_meta_run() defined + ✓ patrol_meta_create_issue() defined + ✓ patrol_meta_untested_scripts() defined + ✓ patrol_meta_bash_compat() defined + ✓ patrol_meta_version_sync() defined + ✓ patrol_meta_dora_trends() defined + ✓ patrol_meta_template_effectiveness() defined + ✓ 
patrol_meta_memory_pruning() defined + ✓ patrol_meta_event_analysis() defined + + Sourcing + ✓ script can be sourced without error + + NO_GITHUB Dry Run + ✓ checks NO_GITHUB for dry-run mode + ✓ supports dry-run output + + Bash Compat Checks + ✓ bash compat check looks for declare -A + ✓ bash compat check looks for readarray/mapfile + + Dedup Logic + ✓ dedup logic skips duplicate issues + ✓ dedup searches existing issues + + Memory Pruning Check + ✓ memory pruning uses du -sk for size check + ✓ memory pruning has MB threshold + + Event Analysis + ✓ event analysis uses 7-day window + ✓ event analysis checks pipeline.completed events + + Closed-Loop Self-Improvement Functions + ✓ patrol_meta_score_impact() defined + ✓ patrol_meta_execute_fix() defined + ✓ patrol_meta_verify_fix() defined + ✓ patrol_meta_rollback() defined + ✓ patrol_meta_learn() defined + ✓ patrol_meta_batch_improve() defined + ✓ patrol_meta_auto() defined + + Impact Scoring + ✓ untested-scripts category gets high score (75) + ✓ dora-regression category gets critical score (85) + + Pipeline Integration + ✓ execute_fix calls shipwright pipeline start + ✓ uses --worktree for isolated execution + + Learning & Memory + ✓ records self-improvements in memory + ✓ emits patrol.meta_learned events + + Batch Processing + ✓ batch_improve looks for meta-improvement labeled issues + ✓ sorts issues by impact score (highest first) + + Dry-Run Mode + ✓ new functions respect NO_GITHUB for dry-run + + Autonomous Loop + ✓ patrol_meta_auto() orchestrates full loop + ✓ auto loop has stage 1 (detection) and stage 2 (batch) + + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright pipeline composer — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Pipeline Composition + ▸ Composed pipeline has valid stage ordering... ✓ + ▸ High-risk issue gets security stages... 
.claude/pipeline-artifacts/composed-pipeline.json +✓ + ▸ Fallback to static template when no intelligence... ✓ + +Conditional Stage Insertion + ▸ Stage inserted at correct position after build... ✓ + ▸ Insert into nonexistent stage fails... ✓ + +Model Downgrade + ▸ Budget constraint triggers model downgrades... ✓ + ▸ Downgrade with nonexistent stage fails... ✓ + +Pipeline Validation + ▸ Validation accepts valid pipeline... ✓ + ▸ Validation rejects invalid ordering (test before build)... ✓ + ▸ Validation rejects missing stage ids... ✓ + ▸ Validation rejects missing stages array... ✓ + +Iteration Estimation + ▸ Iteration estimates are reasonable (1-50 range)... ✓ + +════════════════════════════════════════════════════ + All 12 tests passed ✓ +════════════════════════════════════════════════════ + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright pipeline test — E2E Validation (Real Subprocess) ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up mock environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-pipeline-test.VrBGqF + + ▸ Preflight passes with all mocks... ✓ + ▸ Preflight fails when sw-loop.sh missing... ✓ + ▸ Start requires --goal or --issue... ✓ + ▸ Intake with --goal creates branch + artifacts... ✓ + ▸ Intake with --issue fetches from GitHub... ✓ + ▸ Plan generates plan.md, dod.md, tasks... ✗ File not found: .claude/pipeline-artifacts/plan.md (plan generated) +✗ FAILED + ▸ Build invokes sw loop and commits... ✓ + ▸ Test stage captures results to log... ✓ + ▸ Review generates report with severities... ✗ File not found: .claude/pipeline-artifacts/review.md (review generated) +✗ FAILED + ▸ PR stage creates PR URL artifact... ✓ + ▸ Full E2E pipeline (6 stages)... ✗ File not found: .claude/pipeline-artifacts/plan.md (plan artifact) +✗ FAILED + ▸ Resume continues from partial state... 
✗ File not found: .claude/pipeline-artifacts/plan.md (plan generated after resume) +✗ FAILED + ▸ Abort marks pipeline as aborted... ✓ + ▸ Dry run shows config, no artifacts... ✗ Output missing pattern: Pipeline.*standard (shows pipeline name) + Output (last 5 lines): + Estimated Cost: $0.4200 USD + + ✓ Dry run validation passed + + To execute this pipeline: remove --dry-run flag +✓ + ▸ Self-healing build→test retry loop... ✓ + ▸ Intelligence: Skip stages for documentation issues... ✓ + ▸ Intelligence: Skip stages for low complexity... ✗ Output missing pattern: intelligence.*complexity.*[0-3]|stage.*skipped (should show intelligence skip due to complexity) + Output (last 5 lines): + + jq: parse error: Invalid numeric literal at line 1, column 2 + jq: parse error: Invalid numeric literal at line 1, column 2 + jq: parse error: Invalid numeric literal at line 1, column 2 + jq: parse error: Invalid numeric literal at line 1, column 2 +✗ FAILED + ▸ Intelligence: Finding classification and routing... ✓ + ▸ Intelligence: Mid-pipeline complexity reassessment... ✓ + ▸ Intelligence: Backtracking limit (1 per pipeline)... ✓ + ▸ Cleanup: Post-completion clears checkpoints and transient artifacts... ✗ Expected exit code 0, got 1 (pipeline should complete) +✗ FAILED + ▸ Cleanup: pipeline_cancel_check_runs function exists... ✓ + ▸ Vitals: sw-pipeline-vitals.sh exists and is syntactically valid... ✓ + ▸ Vitals: All vitals functions defined in module... ✓ + ▸ Vitals: Health verdict maps scores correctly... ✓ + ▸ Vitals: Adaptive limit returns valid integer... ✓ + ▸ Vitals: Budget trajectory returns ok/warn/stop... ✓ + ▸ Quality: pipeline_select_audits function exists... ✓ + ▸ Quality: pipeline_security_source_scan function exists... ✓ + ▸ Quality: pipeline_verify_dod function exists... ✓ + ▸ Quality: pipeline_record_quality_score function exists... ✓ + ▸ Quality: Templates have compound_quality_blocking... ✓ + ▸ Vitals: Progress snapshot writes correct file... 
✓ + ▸ Vitals: Momentum score from snapshot history... ✓ + ▸ Vitals: Convergence with decreasing errors... ✓ + ▸ Vitals: Configurable weights via env vars... ✓ + ▸ Vitals: Budget trajectory warn/stop on exhaustion... ✓ + ▸ Quality: Structured findings JSON is valid... ✓ + ▸ Quality: Multi-backtrack counter tracking... ✓ + ▸ Quality: 6 categories in classify_quality_findings... ✓ + ▸ Deploy: Pre-deploy gates exist in pipeline... ✓ + ▸ Deploy: Deploy strategy config pattern... ✓ + ▸ Deploy: Canary deploy flow patterns exist... ✓ + ▸ Pipeline: PIPELINE_STATE references removed... ✓ + ▸ Pipeline: Coverage JSON creation in test stage... ✓ + ▸ Pipeline: _pipeline_compact_goal returns goal+plan+design... ✓ + ▸ Pipeline: load_composed_pipeline sets COMPOSED_STAGES... ✓ + ▸ Vitals: Momentum returns 60 for single snapshot past intake... ✓ + ▸ Vitals: Health gate blocks when health < threshold... ✓ + ▸ Vitals: Health gate passes with default threshold=40... ✓ + ▸ Durable: persist_artifacts function exists... ✓ + ▸ Durable: persist_artifacts skips in non-CI mode... ✓ + ▸ Durable: verify_stage_artifacts passes when artifacts present... ✓ + ▸ Durable: verify_stage_artifacts fails when artifacts missing... ✓ + ▸ Durable: verify_stage_artifacts fails when artifacts empty... ✓ + ▸ Durable: verify_stage_artifacts passes for stages with no requirements... ✓ + ▸ Durable: verify_stage_artifacts design requires plan.md... ✓ + ▸ Durable: mark_stage_complete wires persist for plan stage... 
✓
+
+━━━ Results ━━━
+  Passed: 52
+  Failed: 6
+  Total: 58
+
+Failed tests:
+  ✗ Plan generates plan.md, dod.md, tasks
+  ✗ Review generates report with severities
+  ✗ Full E2E pipeline (6 stages)
+  ✗ Resume continues from partial state
+  ✗ Intelligence: Skip stages for low complexity
+  ✗ Cleanup: Post-completion clears checkpoints and transient artifacts
+
diff --git a/.claude/platform-hygiene.json b/.claude/platform-hygiene.json
index 6ba6c3934..c480b9215 100644
--- a/.claude/platform-hygiene.json
+++ b/.claude/platform-hygiene.json
@@ -1,6 +1,6 @@
 {
-  "timestamp": "2026-03-08T15:58:10Z",
-  "repository": "shipwright",
+  "timestamp": "2026-03-09T10:59:38Z",
+  "repository": "pipeline-issue-122",
   "counts": {
     "hardcoded": 44,
     "fallback": 67,
@@ -159,7 +159,7 @@
       },
       {
         "file": "scripts/sw-evidence.sh",
-        "line": 951
+        "line": 996
       },
       {
         "file": "scripts/sw-pipeline-composer.sh",
@@ -335,11 +335,11 @@
       },
       {
         "file": "scripts/lib/pipeline-execution.sh",
-        "line": 218
+        "line": 226
       },
       {
         "file": "scripts/lib/recruit-commands.sh",
-        "line": 233
+        "line": 241
       },
       {
         "file": "scripts/lib/recruit-commands.sh",
@@ -383,7 +383,7 @@
       },
       {
         "file": "scripts/lib/pipeline-stages-build.sh",
-        "line": 1007
+        "line": 1008
       },
       {
         "file": "scripts/lib/convergence.sh",
@@ -435,11 +435,11 @@
       },
       {
         "file": "scripts/lib/pipeline-stages-intake.sh",
-        "line": 26
+        "line": 29
       },
       {
         "file": "scripts/lib/compat.sh",
-        "line": 430
+        "line": 433
       },
       {
         "file": "scripts/lib/pipeline-quality-checks.sh",
@@ -616,12 +616,12 @@
       "lines": 1985
     },
     {
-      "script": "sw-db.sh",
-      "lines": 1939
+      "script": "sw-pipeline-test.sh",
+      "lines": 1959
     },
     {
-      "script": "sw-pipeline-test.sh",
-      "lines": 1914
+      "script": "sw-db.sh",
+      "lines": 1939
     },
     {
       "script": "sw-self-optimize.sh",
From 2ff2d4b7a5ca4cee89df9714f7ac004492f8466b Mon Sep 17 00:00:00 2001
From: Seth Ford
Date: Mon, 9 Mar 2026 07:20:26 -0400
Subject: [PATCH 5/8] fix: isolate pipeline tests from parent env vars

The pipeline test suite was failing when run inside a pipeline
(e.g., via `shipwright loop`) because parent environment variables like
PIPELINE_TEMPLATE, ARTIFACTS_DIR, and SCRIPT_DIR leaked into the test
subprocess, causing it to load the wrong template and skip stages.

Fixes:
- Clear pipeline-related env vars in invoke_pipeline() before running
  the subprocess
- Handle --effort, --fallback-model, --dangerously-skip-permissions flags
  in mock claude CLI to prevent prompt corruption

Co-Authored-By: Claude Opus 4.6
---
 scripts/sw-pipeline-test.sh | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/scripts/sw-pipeline-test.sh b/scripts/sw-pipeline-test.sh
index c69428374..a0aa157c2 100755
--- a/scripts/sw-pipeline-test.sh
+++ b/scripts/sw-pipeline-test.sh
@@ -136,9 +136,8 @@ create_mock_claude() {
 prompt=""
 while [[ $# -gt 0 ]]; do
     case "$1" in
-        --print|--output-format) shift ;;
-        --model|--max-turns) shift 2 ;;
-        -p) shift 2 ;;
+        --print|--output-format|--dangerously-skip-permissions) shift ;;
+        --model|--max-turns|--effort|--fallback-model|-p) shift 2 ;;
         *) prompt="$1"; shift ;;
     esac
 done
@@ -395,8 +394,13 @@ invoke_pipeline() {
     PIPELINE_EXIT=0
 
     # Invoke the REAL pipeline script as a subprocess
+    # Clear parent pipeline env vars to prevent leaking into test subprocess
     PIPELINE_OUTPUT=$(
         cd "$TEST_TEMP_DIR/project"
+        unset PIPELINE_TEMPLATE PIPELINE_FALLBACK_MODEL PIPELINE_JOB_ID
+        unset ARTIFACTS_DIR SCRIPT_DIR SW_LOOP_STATUS SW_LOOP_GOAL
+        unset SW_LOOP_TEST_OUTPUT SW_LOOP_FINDINGS SW_LOOP_MODIFIED
+        unset SHIPWRIGHT_PIPELINE_ID
         PATH="$TEST_TEMP_DIR/bin:$PATH" \
         bash "$TEST_TEMP_DIR/scripts/sw-pipeline.sh" "$subcommand" "$@" 2>&1
     ) || PIPELINE_EXIT=$?
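
The isolation technique in this patch can be sketched on its own, assuming only standard bash subshell semantics: an `unset` inside `$( ... )` clears the variable for the child command while the parent shell keeps its value. `run_isolated` and the inline `bash -c` child are hypothetical stand-ins for `invoke_pipeline()` and `sw-pipeline.sh`; only the variable names are taken from the diff.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the patch): run a command with
# pipeline-related variables cleared. The unset runs inside the command
# substitution's subshell, so the caller's environment is left untouched.
run_isolated() {
    local output
    output=$(
        unset PIPELINE_TEMPLATE ARTIFACTS_DIR SCRIPT_DIR
        "$@" 2>&1
    )
    printf '%s\n' "$output"
}

# Simulate a parent pipeline that exported its own template choice.
export PIPELINE_TEMPLATE=fast

# The child reports what it sees, falling back to a marker when unset.
child_view=$(run_isolated bash -c 'echo "${PIPELINE_TEMPLATE:-unset}"')

echo "child:  $child_view"
echo "parent: $PIPELINE_TEMPLATE"
```

Doing the `unset` inside the subshell, as opposed to saving and restoring each variable in the parent, keeps the test harness from having to track which variables were originally set.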
From 08f08ec5cd6ba375c512f06909032f8a10974819 Mon Sep 17 00:00:00 2001
From: Seth Ford
Date: Mon, 9 Mar 2026 07:20:30 -0400
Subject: [PATCH 6/8] =?UTF-8?q?loop:=20iteration=202=20=E2=80=94=20autonom?=
 =?UTF-8?q?ous=20progress?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .claude/loop-logs/iteration-2.log |  2 +-
 .claude/loop-state.md             | 10 ++++++----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/.claude/loop-logs/iteration-2.log b/.claude/loop-logs/iteration-2.log
index e7872772d..deab01f95 100644
--- a/.claude/loop-logs/iteration-2.log
+++ b/.claude/loop-logs/iteration-2.log
@@ -1 +1 @@
-Invalid API key · Fix external API key
+{"type":"result","subtype":"success","is_error":false,"duration_ms":531817,"duration_api_ms":338403,"num_turns":52,"result":"All 58 pipeline tests and 19 smoke tests pass. The root cause was that parent pipeline environment variables (`PIPELINE_TEMPLATE=fast`, `ARTIFACTS_DIR`, `SCRIPT_DIR`, etc.) leaked into the test subprocess, causing it to load the wrong template and skip the plan/review
stages.\n\nLOOP_COMPLETE","stop_reason":"end_turn","session_id":"264e0e4e-4a83-471b-ba74-817cddd4e9fd","total_cost_usd":2.932504249999999,"usage":{"input_tokens":54,"cache_creation_input_tokens":91271,"cache_read_input_tokens":4004087,"output_tokens":13535,"server_tool_use":{"web_search_requests":0,"web_fetch_requests":0},"service_tier":"standard","cache_creation":{"ephemeral_1h_input_tokens":91271,"ephemeral_5m_input_tokens":0},"inference_geo":"","iterations":[],"speed":"standard"},"modelUsage":{"claude-opus-4-6":{"inputTokens":54,"outputTokens":13535,"cacheReadInputTokens":4004087,"cacheCreationInputTokens":91271,"webSearchRequests":0,"costUSD":2.91113225,"contextWindow":200000,"maxOutputTokens":32000},"claude-haiku-4-5-20251001":{"inputTokens":17952,"outputTokens":684,"cacheReadInputTokens":0,"cacheCreationInputTokens":0,"webSearchRequests":0,"costUSD":0.021372,"contextWindow":200000,"maxOutputTokens":32000}},"permission_denials":[],"fast_mode_state":"off","uuid":"7f9062f5-4a88-4801-8a6a-91a29d660ef7"} diff --git a/.claude/loop-state.md b/.claude/loop-state.md index 0750ebb95..ab354f1b6 100644 --- a/.claude/loop-state.md +++ b/.claude/loop-state.md @@ -53,16 +53,16 @@ Discoveries from other pipelines: [design] Design completed for Build a production-grade todo application. TypeScript + React frontend with Vite, Express REST API backend, SQLite persistence with Drizzle ORM, JWT authentication (register/login), full CRUD for todos with filtering (all/active/completed), drag-and-drop reorder, due dates, priorities (low/medium/high), dark mode, responsive design. Include comprehensive test suite (unit + integration + e2e). Production-ready: error handling, input validation, rate limiting, CORS, environment config. 
— Resolution: [intake] Stage intake completed — Resolution: [intake] Stage intake completed — Resolution: " -iteration: 0 +iteration: 1 max_iterations: 10 status: running test_cmd: "npm test" model: opus agents: 1 -started_at: 2026-03-09T10:45:28Z -last_iteration_at: 2026-03-09T10:45:28Z +started_at: 2026-03-09T11:11:31Z +last_iteration_at: 2026-03-09T11:11:31Z consecutive_failures: 0 -total_commits: 0 +total_commits: 1 audit_enabled: false audit_agent_enabled: false quality_gates_enabled: false @@ -73,4 +73,6 @@ max_extensions: 3 --- ## Log +### Iteration 1 (2026-03-09T11:11:31Z) +{"type":"result","subtype":"success","is_error":false,"duration_ms":529352,"duration_api_ms":415316,"num_turns":65,"resu From 2028b33f93a79f749ebaa3b0d50e62b4719b8aef Mon Sep 17 00:00:00 2001 From: Seth Ford Date: Mon, 9 Mar 2026 07:52:52 -0400 Subject: [PATCH 7/8] =?UTF-8?q?loop:=20iteration=202=20=E2=80=94=20post-au?= =?UTF-8?q?dit=20cleanup?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .claude/daemon-config.json | 2 +- .claude/intelligence-cache.json | 4 +- .claude/loop-logs/tests-iter-2.log | 8534 ++++++++++++++++++++++++++-- .claude/platform-hygiene.json | 6 +- 4 files changed, 8052 insertions(+), 494 deletions(-) diff --git a/.claude/daemon-config.json b/.claude/daemon-config.json index 6283cbd5d..6acee92ae 100644 --- a/.claude/daemon-config.json +++ b/.claude/daemon-config.json @@ -16,7 +16,7 @@ "pipeline_template": "autonomous", "auto_template": false, "last_optimization": { - "timestamp": "2026-03-08T15:49:14Z", + "timestamp": "2026-03-09T11:50:49Z", "adjustments": "compound_quality enabled (CFR 25% > 20%); merge stage recommended (deploy freq 3.0/week); " } } diff --git a/.claude/intelligence-cache.json b/.claude/intelligence-cache.json index c25fe39a6..70259e98b 100644 --- a/.claude/intelligence-cache.json +++ b/.claude/intelligence-cache.json @@ -32,7 +32,7 @@ "result": { "raw_response": "Mock claude response" }, - 
"timestamp": 1772984580, + "timestamp": 1773056069, "ttl": 3600 }, "efd5524ad1244fd20344bf3ff5d95233": { @@ -263,7 +263,7 @@ } } ], - "timestamp": 1773054569, + "timestamp": 1773056933, "ttl": 1800 }, "fdbb7bc340414cf938cecd4b790fa9e2": { diff --git a/.claude/loop-logs/tests-iter-2.log b/.claude/loop-logs/tests-iter-2.log index 176f067c3..4ad1a6d46 100644 --- a/.claude/loop-logs/tests-iter-2.log +++ b/.claude/loop-logs/tests-iter-2.log @@ -1,615 +1,8173 @@ -> shipwright-cli@1.7.1 test -> bash scripts/sw-pipeline-test.sh && bash scripts/sw-daemon-test.sh && bash scripts/sw-prep-test.sh && bash scripts/sw-fleet-test.sh && bash scripts/sw-fix-test.sh && bash scripts/sw-memory-test.sh && bash scripts/sw-session-test.sh && bash scripts/sw-init-test.sh && bash scripts/sw-tracker-test.sh && bash scripts/sw-heartbeat-test.sh && bash scripts/sw-remote-test.sh && bash scripts/sw-intelligence-test.sh && bash scripts/sw-pipeline-composer-test.sh && bash scripts/sw-self-optimize-test.sh && bash scripts/sw-predictive-test.sh && bash scripts/sw-frontier-test.sh && bash scripts/sw-connect-test.sh +> shipwright-cli@3.2.4 test +> bash scripts/sw-agi-roadmap-test.sh && bash scripts/sw-activity-test.sh && bash scripts/sw-adaptive-test.sh && bash scripts/sw-adversarial-test.sh && bash scripts/sw-architecture-enforcer-test.sh && bash scripts/sw-auth-test.sh && bash scripts/sw-autonomous-test.sh && bash scripts/sw-changelog-test.sh && bash scripts/sw-checkpoint-test.sh && bash scripts/sw-ci-test.sh && bash scripts/sw-cleanup-test.sh && bash scripts/sw-code-review-test.sh && bash scripts/sw-connect-test.sh && bash scripts/sw-context-budget-test.sh && bash scripts/sw-context-test.sh && bash scripts/sw-cost-test.sh && bash scripts/sw-daemon-test.sh && bash scripts/sw-dashboard-test.sh && bash scripts/sw-db-test.sh && bash scripts/sw-decompose-test.sh && bash scripts/sw-decide-test.sh && bash scripts/sw-deps-test.sh && bash scripts/sw-developer-simulation-test.sh && bash 
scripts/sw-discovery-test.sh && bash scripts/sw-doc-fleet-test.sh && bash scripts/sw-docs-agent-test.sh && bash scripts/sw-docs-test.sh && bash scripts/sw-doctor-test.sh && bash scripts/sw-dora-test.sh && bash scripts/sw-durable-test.sh && bash scripts/sw-e2e-orchestrator-test.sh && bash scripts/sw-eventbus-test.sh && bash scripts/sw-feedback-test.sh && bash scripts/sw-fix-test.sh && bash scripts/sw-fleet-discover-test.sh && bash scripts/sw-fleet-test.sh && bash scripts/sw-fleet-viz-test.sh && bash scripts/sw-frontier-test.sh && bash scripts/sw-github-app-test.sh && bash scripts/sw-github-checks-test.sh && bash scripts/sw-github-deploy-test.sh && bash scripts/sw-github-graphql-test.sh && bash scripts/sw-guild-test.sh && bash scripts/sw-heartbeat-test.sh && bash scripts/sw-hello-test.sh && bash scripts/sw-hygiene-test.sh && bash scripts/sw-incident-test.sh && bash scripts/sw-init-test.sh && bash scripts/sw-instrument-test.sh && bash scripts/sw-intelligence-test.sh && bash scripts/sw-jira-test.sh && bash scripts/sw-launchd-test.sh && bash scripts/sw-linear-test.sh && bash scripts/sw-logs-test.sh && bash scripts/sw-loop-test.sh && bash scripts/sw-memory-test.sh && bash scripts/sw-mission-control-test.sh && bash scripts/sw-model-router-test.sh && bash scripts/sw-otel-test.sh && bash scripts/sw-oversight-test.sh && bash scripts/sw-patrol-meta-test.sh && bash scripts/sw-pipeline-composer-test.sh && bash scripts/sw-pipeline-test.sh && bash scripts/sw-pipeline-vitals-test.sh && bash scripts/sw-pm-test.sh && bash scripts/sw-pr-lifecycle-test.sh && bash scripts/sw-predictive-test.sh && bash scripts/sw-prep-test.sh && bash scripts/sw-ps-test.sh && bash scripts/sw-public-dashboard-test.sh && bash scripts/sw-quality-test.sh && bash scripts/sw-reaper-test.sh && bash scripts/sw-recruit-test.sh && bash scripts/sw-regression-test.sh && bash scripts/sw-release-manager-test.sh && bash scripts/sw-release-test.sh && bash scripts/sw-root-cause-test.sh && bash scripts/sw-remote-test.sh 
&& bash scripts/sw-replay-test.sh && bash scripts/sw-retro-test.sh && bash scripts/sw-scale-test.sh && bash scripts/sw-stall-detector-test.sh && bash scripts/sw-security-audit-test.sh && bash scripts/sw-self-optimize-test.sh && bash scripts/sw-session-test.sh && bash scripts/sw-setup-test.sh && bash scripts/sw-standup-test.sh && bash scripts/sw-status-test.sh && bash scripts/sw-strategic-test.sh && bash scripts/sw-stream-test.sh && bash scripts/sw-swarm-test.sh && bash scripts/sw-team-stages-test.sh && bash scripts/sw-templates-test.sh && bash scripts/sw-testgen-test.sh && bash scripts/sw-tmux-pipeline-test.sh && bash scripts/sw-tmux-test.sh && bash scripts/sw-trace-test.sh && bash scripts/sw-tracker-test.sh && bash scripts/sw-triage-test.sh && bash scripts/sw-upgrade-test.sh && bash scripts/sw-ux-test.sh && bash scripts/sw-webhook-test.sh && bash scripts/sw-widgets-test.sh && bash scripts/sw-worktree-test.sh && bash scripts/sw-lib-compat-test.sh && bash scripts/sw-lib-helpers-test.sh && bash scripts/sw-lib-error-actionability-test.sh && bash scripts/sw-lib-daemon-dispatch-test.sh && bash scripts/sw-lib-daemon-failure-test.sh && bash scripts/sw-lib-daemon-poll-test.sh && bash scripts/sw-lib-daemon-state-test.sh && bash scripts/sw-lib-daemon-triage-test.sh && bash scripts/sw-lib-daemon-patrol-test.sh && bash scripts/sw-lib-pipeline-detection-test.sh && bash scripts/sw-lib-pipeline-intelligence-test.sh && bash scripts/sw-lib-pipeline-quality-checks-test.sh && bash scripts/sw-lib-pipeline-stages-test.sh && bash scripts/sw-lib-pipeline-state-test.sh && bash scripts/sw-adapters-test.sh && bash scripts/sw-evidence-test.sh && bash scripts/sw-review-rerun-test.sh && bash scripts/sw-tracker-providers-test.sh && bash scripts/sw-budget-chaos-test.sh && bash scripts/sw-chaos-test.sh && bash scripts/sw-autonomous-e2e-test.sh && bash scripts/sw-memory-discovery-e2e-test.sh && bash scripts/sw-policy-e2e-test.sh && bash scripts/sw-e2e-smoke-test.sh && bash 
scripts/sw-dashboard-e2e-test.sh -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright pipeline test — E2E Validation (Real Subprocess) ║ -╚═══════════════════════════════════════════════════════════════════╝ +╔════════════════════════════════════════════════════════════════╗ +║ AGI-Roadmap Validation — Real Tests for Every Feature ║ +╚════════════════════════════════════════════════════════════════╝ + +Phase 1: Feedback Loops (Discovery, Memory, PM, Failure Learning) + ▸ Daemon: failure_history initialized in state JSON... ✓ + ▸ Daemon: consecutive count uses reduce (run-from-newest, not total)... ✓ + ▸ Daemon: get_max_retries_for_class returns per-class limits... ✓ + ▸ Daemon: exponential backoff formula 5*2^(n-3) is correct... ✓ + ▸ Daemon: resume_after parsed in UTC (not local TZ)... ✓ + ▸ PM: recommend --json flag implemented... ✓ + ▸ PM: learn subcommand functional... ✓ + ▸ Daemon: PM recommend/learn wired into triage + success/failure... ✓ + ▸ Daemon: confidence < 60% upgrades to full template... ✓ + +Phase 2: Agent Coordination (Feedback, Predictive, Oversight, Autonomous) + ▸ Feedback: ARTIFACTS_DIR respects caller override... ✓ + ▸ Feedback: rollback uses PIPESTATUS for correct exit code... ✓ + ▸ Predictive: anomaly detection returns severity for 5x baseline... ✓ + ▸ Predictive: inject-prevention command exists and runs... ✓ + ▸ Pipeline: predictive anomaly/baseline/inject-prevention wired... ✓ + ▸ Pipeline: memory metric wired into stage completion... ✓ + ▸ Oversight: gate approves clean review... ✓ + ▸ Oversight: gate rejects with --reject-if... ✓ + ▸ Oversight: gate JSON safe from newline/quote injection... ✓ + ▸ Pipeline: oversight gate wired + respects SKIP_GATES... ✓ + ▸ Pipeline: feedback collect/create-issue/rollback wired into monitor... ✓ + ▸ Pipeline: intelligence prediction validation wired... ✓ + ▸ Pipeline: predictive anomaly confirmation wired... ✓ + ▸ Pipeline: memory fix-outcome negative path wired... 
✓ + ▸ Triage: offline fallback with recruit... ✓ + ▸ Recruit: policy integration wired... ✓ + ▸ Recruit: meta feedback loop... ✓ + ▸ Recruit: audit (negative-compounding feedback)... ✓ + ▸ Autonomous: safe recruit_args quoting... ✓ + ▸ Autonomous: Claude output redirected to findings file... ✓ + ▸ Autonomous: dual branch check (pipeline + daemon)... ✓ + ▸ Autonomous: run_scheduler with loop/sleep... ✓ + ▸ Autonomous: trigger_pipeline_for_finding wired... ✓ + ▸ Incident: create_hotfix_issue echoes issue number... ✓ + ▸ Incident: trigger_pipeline wires --template hotfix... ✓ + ▸ Incident: trigger_rollback wires sw-feedback.sh... ✓ + +Phase 3: Quality Assurance (Code Review, Testgen, Swarm, Multi-Agent) + ▸ Code review: run_claude_semantic_review exists... ✓ + ▸ Code review: semantic findings integrated... ✓ + ▸ Testgen: Claude prompt asks for real assertions... ✓ + ▸ Testgen: prompt piped to avoid shell expansion... ✓ + ▸ Swarm: spawn creates tmux session... ✓ + ▸ Swarm: retire kills tmux session... ✓ + ▸ Swarm: spawn/retire functional (real tmux)... ✓ + ▸ Loop: multi-agent restarts not blocked... ✓ + +Phase 4: Meta-Cognition (Effectiveness, Self-Awareness, Capability) + ▸ Pipeline: record_stage_effectiveness creates valid JSONL... ✓ + ▸ Pipeline: self-awareness hint triggers on >50% failure rate... ✓ + ▸ Pipeline: effectiveness recorded on both complete and failed... ✓ + ▸ Pipeline: discovery inject wired... ✓ + ▸ Pipeline: self-awareness hint injected into plan prompt... ✓ + +Integration & Safety + ▸ CI: integration-claude jobs in workflow... ✓ + ▸ Integration-claude: skip path functional... ✓ + ▸ All modified scripts have strict mode... ✓ + ▸ All modified scripts have ERR trap... ✓ + ▸ No hardcoded secrets in scripts... ✓ + + ────────────────────────────────────────── + + All 53 tests passed! 
+ + + Shipwright Activity Tests + ══════════════════════════════════════════ + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ Source guard pattern (if/then/fi) + ✓ VERSION variable defined + + Help Output + ✓ help exits 0 and contains USAGE + ✓ help lists 'watch' subcommand + ✓ help lists 'snapshot' subcommand + ✓ help lists 'history' subcommand + ✓ help lists 'stats' subcommand + ✓ help lists 'agents' subcommand + ✓ --help alias works + + Error Handling +shipwright activity — Live agent activity stream + +USAGE + shipwright activity [subcommand] [options] + +SUBCOMMANDS + watch Live stream of agent activity (default) + snapshot Current state of all active agents + history [range] Replay past activity (e.g., '1h', '10m', 'all') + stats Running counters (events, commits, tests, agents) + agents List known agents and last activity + help Show this help message + +OPTIONS + --type Filter events by type (e.g., 'stage.completed') + --agent Filter by agent name + --team Filter by team + --stage Filter by pipeline stage (e.g., 'build') + +EXAMPLES + \033[2mshipwright activity\033[0m # Live stream + \033[2mshipwright activity watch --type stage.*\033[0m # Only stage events + \033[2mshipwright activity history 1h\033[0m # Last hour + \033[2mshipwright activity snapshot\033[0m # Current state + \033[2mshipwright activity stats\033[0m # Counters + ✓ Unknown command exits non-zero + + Subcommands Without Events File + ✓ snapshot exits non-zero with no events + ✓ stats exits non-zero with no events + ✓ agents exits non-zero with no events + + Event Emission + ✓ emit_event creates events.jsonl + ✓ emit_event writes valid JSON + ✓ Event contains type field + ✓ Event contains agent field + ✓ Event contains numeric count + + Format Helpers + ✓ get_icon_for_type commit returns icon + ✓ get_icon_for_type test.passed returns icon + ✓ get_icon_for_type unknown returns bullet + ✓ format_timestamp strips T and Z + + Stats With Events + ✓ stats shows Total Events + 
✓ stats shows Commits count + ✓ stats shows Pipelines count + + History Subcommand + ✓ history all shows activity header + + + ────────────────────────────────────────── + + All 28 tests passed + + + Shipwright Adaptive Tests + ══════════════════════════════════════════ + + help / version + ✓ help exits 0 + ✓ help shows USAGE + ✓ help shows SUBCOMMANDS + ✓ help mentions get + ✓ help mentions train + ✓ help mentions profile + ✓ version exits 0 + ✓ version output contains version string + + error handling + ✓ Unknown command exits non-zero + + get command + ✓ get timeout with default exits 0 + + profile command + ✓ profile exits 0 + + reset command + ✓ reset exits 0 + + script safety + ✓ Uses set -euo pipefail + ✓ Has source guard pattern + + statistical functions + ✓ percentile, mean, median functions defined in source + ✓ mean returns numeric value (avg of 1-5 is 3) + + get_timeout / get_iterations / get_model + ✓ get_timeout returns number (default with no events) + ✓ get_iterations returns number + ✓ get_model returns valid model name + + train subcommand + ✓ train subcommand runs with mock events + + ────────────────────────────────────────── + + All 20 tests passed + + + + shipwright adversarial test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ Source guard pattern (if/then/fi) + ✓ VERSION variable defined + + Help Output + ✓ help exits 0 and contains USAGE + ✓ help lists 'review' subcommand + ✓ help lists 'iterate' subcommand + ✓ help mentions adversarial_enabled flag + ✓ --help alias works + ✓ -h alias works + + Error Handling + ✓ Unknown command exits non-zero + + Review Subcommand + ✓ review disabled returns empty JSON array + ✓ review without diff arg exits non-zero + + Iterate Subcommand + ✓ iterate without args exits non-zero + ✓ iterate with empty findings converges + ✓ iterate past max rounds returns findings + + Configuration + ✓ 
ADVERSARIAL_MAX_ROUNDS env var respected + ✓ disabled config returns empty array + + Event Emission + ✓ emit_event creates events.jsonl + ✓ emit_event writes valid JSON + ✓ Event contains type field + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + shipwright architecture-enforcer test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ Source guard pattern (if/then/fi) + ✓ VERSION variable defined + + Help Output + ✓ help exits 0 and contains USAGE + ✓ help lists 'build' subcommand + ✓ help lists 'validate' subcommand + ✓ help lists 'evolve' subcommand + ✓ help mentions architecture_enabled flag + ✓ --help alias works + ✓ -h alias works + + Error Handling + ✓ Unknown command exits non-zero + + Build Subcommand + ✓ build disabled returns empty JSON object + ✓ build with enabled returns model with layers + ✓ build model contains patterns array + ✓ build model contains conventions array + + Validate Subcommand + ✓ validate disabled returns empty array + ✓ validate without diff arg exits non-zero + ✓ validate without model file returns empty array + + Evolve Subcommand + ✓ evolve disabled exits 0 + ✓ evolve without model exits 0 + + Model Storage + ✓ build stores model in ~/.shipwright/memory/ + ✓ Stored model is valid JSON + + Event Emission + ✓ emit_event creates events.jsonl + ✓ emit_event writes valid JSON + ✓ Event contains type field + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + Shipwright Auth Tests + ══════════════════════════════════════════ + + Help + ✓ help shows usage + ✓ help shows commands + ✓ --help flag works + ✓ unknown command shows error + Status & Users + ✓ status with no login shows warning + ✓ users with empty shows warning + ✓ token with no user errors + ✓ switch without user exits non-zero + Auth Storage + ✓ auth file has users array + ✓ auth file has active_user + 
✓ users lists stored user + ✓ token shows stored token + ✓ user info shows login + ✓ switch to nonexistent errors + ✓ logout succeeds + + ────────────────────────────────────────── + + All 15 tests passed + + + + Shipwright Autonomous Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ unknown command exits 1 + ✓ start shows running message + ✓ state.json created after start + ✓ state status is running + ✓ config.json created + ✓ config is valid JSON + ✓ status shows dashboard + ✓ pause sets status to paused + ✓ resume sets status to running + ✓ stop sets status to stopped + ✓ config show displays settings + ✓ config set interval works + ✓ history handles no data + + Config Set/Show Cycle + ✓ config show reflects set value + ✓ config value persists in file + + Config Structure + ✓ config contains key: cycle_interval_minutes + ✓ config contains key: max_issues_per_cycle + ✓ config contains key: daemon_aware + + History With Fixture Events + ✓ history shows recent cycles + ✓ history shows cycle entries + + Status Fields + ✓ status when running includes Status + ✓ status when running includes Cycles + ✓ status when running includes Issues Created + ✓ status when running includes Pipelines + ✓ status when running includes Cycle Interval + ✓ status when running shows running + ✓ status when stopped shows stopped + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Changelog Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help / version + ✓ help exits 0 + ✓ help shows USAGE + ✓ help mentions generate + ✓ help mentions preview + ✓ help mentions version + ✓ help mentions migrate + ✓ VERSION variable defined + + error handling + ✓ Unknown command exits non-zero + + formats command + ✓ formats exits 0 + + generate command + ✓ generate exits 0 + + version command + ✓ 
version recommendation exits 0 + + script safety + ✓ Uses set -euo pipefail + ✓ Has source guard pattern + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + shipwright checkpoint test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ VERSION variable defined + + Help Output + ✓ help exits 0 and contains USAGE + ✓ help lists 'save' subcommand + ✓ help lists 'restore' subcommand + ✓ help lists 'list' subcommand + ✓ help lists 'clear' subcommand + ✓ help lists 'expire' subcommand + ✓ --help alias works + ✓ -h alias works + + Error Handling + +shipwright checkpoint v3.2.4 — Save and restore agent state mid-stage + +USAGE + shipwright checkpoint [options] + +COMMANDS + save Save a checkpoint for a stage + restore Restore a checkpoint (prints JSON to stdout) + save-context Save Claude context (goal, findings, test output) for resume + restore-context Restore Claude context (exports RESTORED_* and SW_LOOP_* vars) + list Show all available checkpoints + clear Remove checkpoint(s) + expire Remove checkpoints older than N hours + +SAVE OPTIONS + --stage Stage name (required) + --iteration Current iteration number + --git-sha Git commit SHA (default: HEAD) + --files-modified "f1,f2" Comma-separated list of modified files + --tests-passing Mark tests as passing + --loop-state Loop state (running, paused, etc.) 
+ +RESTORE OPTIONS + --stage Stage to restore (required) + +CLEAR OPTIONS + --stage Stage to clear + --all Clear all checkpoints + +EXPIRE OPTIONS + --hours Max age in hours (default: 24) + +EXAMPLES + shipwright checkpoint save --stage build --iteration 5 + shipwright checkpoint save --stage build --iteration 3 --tests-passing --files-modified "src/auth.ts,src/middleware.ts" + shipwright checkpoint restore --stage build + shipwright checkpoint list + shipwright checkpoint clear --stage build + shipwright checkpoint clear --all + shipwright checkpoint expire --hours 48 + ✓ Unknown command exits non-zero + + Save Subcommand + +shipwright checkpoint v3.2.4 — Save and restore agent state mid-stage + +USAGE + shipwright checkpoint [options] + +COMMANDS + save Save a checkpoint for a stage + restore Restore a checkpoint (prints JSON to stdout) + save-context Save Claude context (goal, findings, test output) for resume + restore-context Restore Claude context (exports RESTORED_* and SW_LOOP_* vars) + list Show all available checkpoints + clear Remove checkpoint(s) + expire Remove checkpoints older than N hours + +SAVE OPTIONS + --stage Stage name (required) + --iteration Current iteration number + --git-sha Git commit SHA (default: HEAD) + --files-modified "f1,f2" Comma-separated list of modified files + --tests-passing Mark tests as passing + --loop-state Loop state (running, paused, etc.) 
+ +RESTORE OPTIONS + --stage Stage to restore (required) + +CLEAR OPTIONS + --stage Stage to clear + --all Clear all checkpoints + +EXPIRE OPTIONS + --hours Max age in hours (default: 24) + +EXAMPLES + shipwright checkpoint save --stage build --iteration 5 + shipwright checkpoint save --stage build --iteration 3 --tests-passing --files-modified "src/auth.ts,src/middleware.ts" + shipwright checkpoint restore --stage build + shipwright checkpoint list + shipwright checkpoint clear --stage build + shipwright checkpoint clear --all + shipwright checkpoint expire --hours 48 + ✓ save without --stage exits non-zero +✓ Checkpoint saved for stage build (iteration 5) + ✓ save creates checkpoint file + ✓ Checkpoint is valid JSON + ✓ Checkpoint stage field correct + ✓ Checkpoint iteration field correct + ✓ Checkpoint git_sha populated +✓ Checkpoint saved for stage test (iteration 0) + ✓ save --tests-passing sets true +✓ Checkpoint saved for stage review (iteration 0) + ✓ save --files-modified stores 2 files +✓ Checkpoint saved for stage deploy (iteration 0) + ✓ save --loop-state stores state + ✓ Checkpoint created_at timestamp present + + Restore Subcommand + ✓ restore returns checkpoint JSON + ✓ Restored checkpoint has correct stage + ✓ restore missing stage exits non-zero + ✓ restore without --stage exits non-zero + + List Subcommand + ✓ list shows Checkpoints header + ✓ list shows build checkpoint + ✓ list shows checkpoint count + ✓ list with no checkpoints shows empty + + Clear Subcommand +✓ Checkpoint saved for stage build (iteration 1) +✓ Checkpoint saved for stage test (iteration 2) +✓ Cleared checkpoint for stage build + ✓ clear --stage removes specific checkpoint + ✓ clear --stage preserves other checkpoints + ✓ clear without args exits non-zero +✓ Checkpoint saved for stage build (iteration 3) +✓ Cleared 2 checkpoint(s) + ✓ clear --all removes all checkpoints + + Expire Subcommand + ✓ expire with no checkpoints exits 0 + + Save-context / Restore-context + ✓ 
save-context creates claude-context.json + ✓ Context goal saved correctly + ✓ Context iteration saved correctly + ✓ restore-context exports RESTORED_GOAL + ✓ restore-context exports SW_LOOP_GOAL + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright CI Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ version shows version + ✓ unknown command exits 1 + ✓ generate starts processing + ✓ validate runs on valid workflow + ✓ VERSION is defined + ✓ analyze shows analysis + ✓ analyze shows cache info + ✓ matrix generates config + ✓ matrix workflow file exists + ✓ validate passes on valid workflow + ✓ runners list shows options + ✓ runners recommend shows guidance + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Cleanup Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help + ✓ help exits 0 + ✓ help shows usage + ✓ help shows --force + ✓ help shows dry-run + + Dry-Run (Empty) + ✓ dry-run shows Tmux Windows section + ✓ dry-run shows Team Configs section + ✓ dry-run shows Task Lists section + ✓ dry-run shows Pipeline Artifacts section + ✓ dry-run shows Pipeline State section + ✓ dry-run shows Heartbeats section + ✓ dry-run reports clean + + Force Mode (Empty) + ✓ force shows FORCE MODE + ✓ force reports nothing to clean + + Detect Team Configs + ✓ dry-run detects team dir + ✓ dry-run shows would remove + + Force Removes Teams + ✓ force removes team dir + ✓ team directory actually removed + + Detect Task Lists + ✓ dry-run detects task dir + ✓ task directory actually removed + + Detect Stale Heartbeats + ✓ dry-run detects stale heartbeat + + Error Handling + ✓ unknown option exits non-zero + ✓ unknown option shows error + + Summary Counting + ✓ summary shows found count + ✓ summary shows --force hint + + + + 
────────────────────────────────────────── + + All 24 tests passed + + + + Shipwright Code Review Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ review runs and starts reviewing + ✓ trends with no data + ✓ config show outputs valid config + ✓ unknown subcommand exits nonzero + ✓ no false long function detection on small file + ✓ style consistency check runs without crash + ✓ architecture boundary check runs without crash + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright connect — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Identity Resolution + ▸ resolve_developer_id from DEVELOPER_ID env... ✓ + ▸ resolve_developer_id from git config... ✓ + ▸ resolve_developer_id fallback to USER... ✓ + ▸ resolve_machine_name from MACHINE_NAME env... ✓ + ▸ resolve_machine_name from hostname... ✓ + +Dashboard URL Resolution + ▸ resolve_dashboard_url from --url flag... ✓ + ▸ resolve_dashboard_url from DASHBOARD_URL env... ✓ + ▸ resolve_dashboard_url from team-config.json... ✓ + ▸ resolve_dashboard_url falls back to default... ✓ + +Start/Stop Lifecycle + ▸ cmd_start creates PID file... ▸ Starting connect to http://localhost:8767 +▸ Developer: test-developer @ test-machine +✓ Connect started (PID 36454) + Logs: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-connect-test.pslDt2/home/.shipwright/connect.log + Stop: shipwright connect stop +✓ + ▸ cmd_start rejects if already running... Stop it first: shipwright connect stop +✓ + ▸ cmd_stop removes PID file... ⚠ Process 99999 not running — cleaning up stale PID file +✓ + ▸ cmd_stop handles missing PID gracefully... ⚠ Connect is not running (no PID file) +✓ + +Status + ▸ cmd_status shows connected when PID alive... 
✓ + ▸ cmd_status shows disconnected when no PID... ✓ + +Join Flow + ▸ cmd_join verifies token against dashboard... ✓ + ▸ cmd_join saves team-config.json... ✓ + ▸ cmd_join rejects invalid token... ✓ + ▸ cmd_join accepts --url and --token flags... ✓ + +Heartbeat & Disconnect Payloads + ▸ Heartbeat payload includes required fields... ✓ + ▸ Send disconnect sends proper payload... ✓ + +Configuration & Utilities + ▸ ensure_dir creates shipwright directory... ✓ + ▸ now_iso returns valid ISO timestamp... ✓ + ▸ Script has correct version... ✓ + +Integration + ▸ Help command shows all main commands... ✓ + +════════════════════════════════════════════════════ + All 25 tests passed ✓ +════════════════════════════════════════════════════ + + + Context Budget Monitor Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Budget Initialization + ✓ init creates config (exit 0) + ✓ config has correct total_budget + ✓ system_reserve is 10% (80000) + ✓ tools_reserve is 10% (80000) + ✓ working_memory is 60% (480000) + ✓ output_reserve is 20% (160000) + + Token Estimation + ✓ estimation produces JSON + ✓ estimate contains utilization_percent + ✓ estimate contains total_used (>0) + + Status Checking (Thresholds) + ✓ 50% utilization returns green status + ✓ 70% utilization returns yellow status + ✓ 85% utilization returns red status + ✓ 95% utilization returns critical status + + Context Trimming + ✓ green status doesn't trim content + ✓ yellow status reduces content length + ✓ hard truncate respects size limit + + Iteration Summarization + ✓ summarize_iteration succeeds (exit 0) + ✓ iteration summary file created with 1 entry + + Budget Report Generation + ✓ report contains budget_config + ✓ report contains iteration_summaries + + State Logging + ✓ log_state succeeds (exit 0) + ✓ log file created with entries + + + ────────────────────────────────────────── + + All 22 tests passed + + + + Shipwright Context Tests + 
══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows commands + ✓ --help exits 0 + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + + show subcommand + ✓ show exits 0 + ✓ show outputs context header + + clear subcommand + ✓ clear exits 0 + ✓ clear confirms cleared + + gather subcommand + ✓ gather without args exits 1 + ✓ gather shows must provide + ✓ gather with unknown option exits 1 + + gather with goal + ✓ gather exits 0 + ✓ gather shows building + ✓ gather shows success + ✓ context-bundle.md created + + show after gather + ✓ show after gather exits 0 + ✓ show contains pipeline context + + clear after gather + ✓ clear exits 0 + + gather with issue + ✓ gather with --issue exits 0 + ✓ gather shows building + + gather default stage + ✓ gather default stage exits 0 + + internal stage_guidance + ✓ stage_guidance returns plan guidance + ✓ stage_guidance returns build guidance + ✓ stage_guidance handles unknown stage + + + + ────────────────────────────────────────── + + All 26 tests passed + + + + Shipwright Cost Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help / version + ✓ help exits 0 + ✓ help shows USAGE + ✓ help shows COMMANDS + ✓ help mentions show + ✓ help mentions budget + ✓ help mentions calculate + ✓ VERSION variable defined + + state management + ✓ costs.json created on first use + ✓ budget.json created on first use + ✓ costs.json has entries array + ✓ budget.json has daily_budget_usd + + budget commands + ✓ budget set exits 0 + ✓ budget set to 50 + ✓ budget show exits 0 + + error handling + ✓ Unknown command exits non-zero + + calculate + ✓ calculate exits 0 + + script safety + ✓ Uses set -euo pipefail + ✓ ERR trap is set + + context efficiency in cost dashboard + ✓ Cost dashboard has CONTEXT EFFICIENCY section + ✓ Cost dashboard reads loop.context_efficiency 
events + ✓ Context efficiency reports utilization and waste + ✓ Dashboard renders CONTEXT EFFICIENCY with event data + ✓ Dashboard shows avg budget utilization + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright daemon test — Unit Tests (Synthetic Events) ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up test environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-daemon-test.1lvtWG + + ▸ dora_grade deploy_freq Elite (>= 7)... ✓ + ▸ dora_grade deploy_freq High (>= 1)... ✓ + ▸ dora_grade deploy_freq Medium (>= 0.25)... ✓ + ▸ dora_grade deploy_freq Low (< 0.25)... ✓ + ▸ dora_grade CFR thresholds (Elite/High/Medium/Low)... ✓ + ▸ Stage timings filter-first jq query... ✓ + ▸ MTTR pairs failures with next success... ✓ + ▸ epoch_to_iso helper function... ✓ + ▸ Health check detects stale jobs... scripts/sw-daemon-test.sh: line 120: 38612 Terminated: 15 sleep 300 +✓ + ▸ Priority label sorting... ✓ + ▸ Degradation alert triggers on high CFR... ✓ + ▸ Metrics --json output with cycle_time keys... ✓ + ▸ Self-labeling includes watch_label when enabled... ✓ + ▸ Self-labeling excludes watch_label when disabled... ✓ + ▸ Patrol recurring failures label construction... ✓ + ▸ DORA degradation event detection... ✓ + ▸ Retry exhaustion event detection... ✓ + ▸ Untested script detection logic... ✓ + ▸ Progress detects stage advancement... ✓ + ▸ Progress detects stuck (no change N checks)... ✓ + ▸ Progress detects repeated error loop... ✓ + ▸ Progress resets on diff growth... ✓ + ▸ Hard limit kills even with progress on... scripts/sw-daemon-test.sh: line 200: 40541 Terminated: 15 sleep 300 +✓ + ▸ Adaptive cycles extends limit on >50% issue drop... ✓ + ▸ Adaptive cycles reduces limit on issue increase... ✓ + ▸ Adaptive cycles respects 2x base hard ceiling... ✓ + ▸ Adaptive cycles no-op on first cycle... 
✓ + ▸ Cleanup: Checkpoint expire removes old checkpoints... ▸ Expired: build checkpoint (1h+ old) +✓ Expired 1 checkpoint(s) older than 1h +✓ + ▸ Cleanup: Failure handler removes watch label... ✓ + ▸ Cleanup: Failure handler closes draft PRs... ✓ + ▸ Cleanup: sw-cleanup.sh has all artifact cleanup sections... ✓ + ▸ Daemon sources vitals module... ✓ + ▸ Vitals verdict maps to daemon verdict (continue→healthy etc)... ✓ + ▸ Vitals emits pipeline.vitals_check events... ✓ + ▸ Auto-scale includes vitals health factor... ✓ + ▸ Quality memory drives template selection... ✓ + ▸ Vitals-based progress with static fallback... ✓ + ▸ Memory: query fix for error returns matching fix... ✓ + ▸ Memory: DORA template escalation patterns exist... ✓ + ▸ Memory: All 12 error categories in post-tool-use.sh... ✓ + ▸ Daemon: Template weights selection reads weights file... ✓ + ▸ Daemon: Auto-enable self_optimize when auto_template is true... ✓ + ▸ Intelligence: classify_failure detects auth errors... ✓ + ▸ Intelligence: classify_failure has all 6 failure classes... ✓ + ▸ Intelligence: Retry skips auth_error and invalid_issue... ✓ + ▸ Intelligence: API errors get extended 300s backoff... ✓ + ▸ Intelligence: daemon_preflight_auth_check exists and auto-pauses... ✓ + ▸ Intelligence: Process group spawning (set -m)... ✓ + ▸ Intelligence: Process tree kill in cleanup (pkill -P)... ✓ + ▸ Intelligence: Consecutive failure auto-pause (3 threshold)... ✓ + ▸ Intelligence: Retry escalation args passed to spawn... ✓ + ▸ Intelligence: classify_failure wired into retry logic... ✓ + +━━━ Results ━━━ + Passed: 66 + Failed: 0 + Total: 52 + +All 66 tests passed! + + +Dashboard Smoke Tests + +Structure + ▸ server.ts exists... ✓ + ▸ server.ts has valid structure (imports, fetch)... ✓ + ▸ public/ directory exists... ✓ + ▸ index.html exists... ✓ + ▸ src/ directory exists... ✓ + ▸ main.ts entry point exists... ✓ + ▸ tsconfig.json exists... 
✓ + ▸ src/ modules exist (core, views, components, design, types, canvas)... ✓ + +Routes + ▸ Server exports /api/health and /api/status... ✓ + ▸ Server exports /ws WebSocket route... ✓ + ▸ Server exports /api/context-efficiency... ✓ + +Integrity + ▸ bun check passes (if bun available)... ✓ + ▸ index.html references dist/main.js... ✓ + ▸ Frontend bundle builds (if bun available)... ✓ + ▸ TypeScript type check passes (if tsc available)... (typescript not installed, skipping) +✓ + +════════════════════════════════════════════════════ + All 15 tests passed ✓ +════════════════════════════════════════════════════ + + +╔════════════════════════════════════════════════════════╗ +║ shipwright db — SQLite Persistence Test Suite ║ +╚════════════════════════════════════════════════════════╝ + +Setting up test environment... + +Schema Creation + ▸ init_schema creates all tables... ✓ + ▸ Schema includes required tables... ✓ + ▸ WAL mode is enabled... ✓ + ▸ check_sqlite3 caches result... ✓ + +Database Availability + ▸ db_available returns true when ready... ✓ + ▸ db_available returns false without sqlite3... /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.8ZHWF8/sw-db.sh: line 19: dirname: command not found +✓ + +Event CRUD Operations + ▸ db_add_event inserts event... ✓ + ▸ add_event dual-writes to SQLite + JSONL... ✓ + +Daemon State Management + ▸ db_save_job saves active job... ✓ + ▸ db_complete_job marks completed... ✓ + ▸ db_fail_job marks failed... ✓ + ▸ db_list_active_jobs returns JSON array... ✓ + ▸ db_active_job_count returns count... ✓ + ▸ db_is_issue_active checks status... ✓ + +Cost Tracking + ▸ db_record_cost saves entry... ✓ + ▸ db_cost_today calculates total... ✓ + ▸ db_set_budget and db_get_budget... ✓ + +Heartbeat Management + ▸ db_record_heartbeat saves heartbeat... ✓ + ▸ db_list_heartbeats returns array... ✓ + ▸ db_clear_heartbeat deletes entry... ✓ + +Memory & Failure Tracking + ▸ db_record_failure saves pattern... 
✓ + ▸ db_query_similar_failures finds matches... ✓ + +JSON Data Migration + ▸ migrate_json_data imports events... ▸ Migrating schema v0 → v2... +✓ Migrated to schema v2 +▸ Migrating schema v0 → v3... +✓ Migrated to schema v3 +▸ Migrating schema v0 → v4... +✓ Migrated to schema v4 +▸ Migrating schema v0 → v5... +✓ Migrated to schema v5 +▸ Migrating schema v0 → v6... +✓ Migrated to schema v6 +▸ Importing events from /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.8ZHWF8/home/.shipwright/events.jsonl... +✓ Events: 2 imported, 0 skipped (duplicates) +▸ Importing heartbeats... +✓ Heartbeats: 0 imported + +✓ Migration complete: 2 total records imported + +▸ Verification: + Events in DB: 2 + Cost entries: 0 + Heartbeats: 0 +✓ + ▸ migrate_json_data imports costs... ▸ Database already at schema v6 +▸ Importing events from /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.8ZHWF8/home/.shipwright/events.jsonl... +✓ Events: 2 imported, 0 skipped (duplicates) +▸ Importing costs from /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.8ZHWF8/home/.shipwright/costs.json... +✓ Costs: 2 entries imported +▸ Importing heartbeats... +✓ Heartbeats: 0 imported + +✓ Migration complete: 4 total records imported + +▸ Verification: + Events in DB: 2 + Cost entries: 2 + Heartbeats: 0 +✓ + ▸ migrate_json_data imports budget... ▸ Database already at schema v6 +▸ Importing events from /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.8ZHWF8/home/.shipwright/events.jsonl... +✓ Events: 2 imported, 0 skipped (duplicates) +▸ Importing costs from /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.8ZHWF8/home/.shipwright/costs.json... +✓ Costs: 2 entries imported +▸ Importing budget from /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.8ZHWF8/home/.shipwright/budget.json... +✓ Budget: imported ($50.00, enabled=true) +▸ Importing heartbeats... 
+✓ Heartbeats: 0 imported + +✓ Migration complete: 4 total records imported + +▸ Verification: + Events in DB: 2 + Cost entries: 2 + Heartbeats: 0 +✓ + +Health Checks + ▸ db_health_check validates database... ✓ + +Export & Backup + ▸ export_db creates JSON backup... ▸ Exporting database to /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.8ZHWF8/home/.shipwright/test-backup.json... +✓ Database exported to /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-db-test.8ZHWF8/home/.shipwright/test-backup.json +✓ + +Data Cleanup + ▸ cleanup_old_data removes old entries... ▸ Cleaning records older than 30 days (before 2026-02-07T11:20:48Z)... +✓ Deleted: 0 events, 0 costs, 0 daemon jobs, 0 stages +✓ + +Concurrent Operations + ▸ Concurrent writes don't corrupt DB... 5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +5000 +✓ + +Pipeline Run Tracking + ▸ add_pipeline_run creates entry... ✓ + ▸ update_pipeline_status updates run... 
✓ + +══════════════════════════════════════════════════════════ + All 31 tests passed ✓ +══════════════════════════════════════════════════════════ + + + + Shipwright Decompose Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows commands + ✓ --help exits 0 + + version flag + ✓ --version exits 0 + ✓ --version shows version + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + ✓ analyze without issue exits 1 + ✓ analyze shows usage + ✓ decompose without issue exits 1 + ✓ auto without issue exits 1 + + analyze subcommand (mock) + ✓ analyze exits 0 with NO_GITHUB + ✓ analyze outputs complexity_score + ✓ analyze outputs should_decompose + ✓ analyze outputs subtasks + ✓ analyze outputs valid JSON + ✓ analyze returns complexity_score 85 + ✓ analyze returns should_decompose true + + decompose subcommand (mock) + ✓ decompose exits 0 with NO_GITHUB + ✓ decompose shows decomposing + + auto subcommand (mock) + ✓ auto exits 0 with NO_GITHUB + + state file creation + ✓ events.jsonl created + + DAG scheduling (new features) + ✓ schedule exits 0 + ✓ schedule shows valid DAG + ✓ schedule shows waves + ✓ critical-path exits 0 + ✓ critical-path shows title + ✓ critical-path shows hours + ✓ visualize text exits 0 + ✓ visualize shows DAG title + ✓ visualize shows task 0 + ✓ visualize mermaid exits 0 + ✓ visualize mermaid has graph + ✓ help shows schedule cmd + ✓ help shows critical-path cmd + ✓ help shows visualize cmd + ✓ version shows 3.2.4 + ✓ mock data includes depends_on field + + + + ────────────────────────────────────────── + + All 39 tests passed + + + sw-decide Tests + ══════════════════════════════════════════ + + + help + ✓ help shows usage + ✓ help mentions run + ✓ help mentions status + ✓ help mentions tiers + ✓ help mentions candidates + ✓ help mentions halt + ✓ help mentions resume + ✓ help mentions dry-run + ✓ --help shows usage 
+ + tiers + ✓ tiers shows auto + ✓ tiers shows propose + ✓ tiers shows draft + ✓ tiers shows category rules + ✓ tiers shows limits + + signals + ✓ candidate has id + ✓ candidate has signal + ✓ candidate has category + ✓ candidate has risk_score + ✓ candidate has dedup_key + ✓ candidate has collected_at + ✓ candidate is valid JSON + ✓ read_pending returns data + ✓ clear_pending empties file + + scoring + ✓ scored has value_score + ✓ scored has scores object + ✓ critical security scores well (75) + ✓ security scores higher than dead_code (75 > 33) + ✓ impact weight loaded + ✓ urgency weight loaded + + autonomy + ✓ deps_patch -> auto + ✓ refactor_hotspot -> propose + ✓ new_feature -> draft + ✓ unknown -> draft + ✓ auto labels include shipwright + ✓ auto labels include ready-to-build + ✓ propose labels include proposed +jq: parse error: Unmatched '}' at line 1, column 111 +jq: parse error: Unmatched '}' at line 1, column 111 + ✓ budget available with no decisions +jq: parse error: Unmatched '}' at line 1, column 111 + ✓ budget exhausted after 16 issues + ✓ rate limit passes with no history +jq: parse error: Unmatched '}' at line 1, column 111 + ✓ rate limit blocks recent decision + ✓ not halted initially + ✓ halted after halt() + ✓ halt file created + ✓ resumed after resume() + + risk ceiling + ✓ risk 20 below ceiling 30 + ✓ risk 35 above ceiling 30 + + dry-run + ✓ dry-run shows DRY RUN + ✓ no daily log created in dry-run + ✓ no drafts in dry-run + + decision log + ✓ daily log exists + ✓ log has decision id + ✓ log has value_score + ✓ last-decision written + + outcome learning + ✓ urgency weight adjusted on security success (25) + ✓ risk weight increased on failure (13 >= 13) + ✓ weights sum to 100 + ✓ weights file written + + candidates + ✓ candidates shows title + ✓ candidates shows signal + + halt/resume CLI + ✓ halt succeeds + ✓ run blocked when halted + ✓ resume succeeds + + status + ✓ status shows active + ✓ status shows decisions + ✓ status shows budget + ✓ 
status shows weights + + cycle integration + ✓ cycle shows Decision Engine + ✓ cycle shows Cycle Complete + ✓ cycle processes candidates + ✓ cycle shows AUTO tier + ✓ cycle shows DRAFT tier + + log command + ✓ log shows today's date + ✓ log shows entry + + ────────────────────────────────────────── + + All 33 tests passed + + + + Shipwright Deps Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows commands + ✓ --help exits 0 + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + + scan subcommand + ✓ scan exits 0 with NO_GITHUB + ✓ scan shows warning + + classify subcommand + ✓ classify without args exits 1 + ✓ classify shows usage + ✓ classify exits 0 with NO_GITHUB + ✓ classify shows warning + + batch subcommand + ✓ batch exits 0 with NO_GITHUB + ✓ batch shows warning + + report subcommand + ✓ report exits 0 with NO_GITHUB + ✓ report shows warning + + merge subcommand + ✓ merge without args exits 1 + + test subcommand + ✓ test without args exits 1 + + internal parse_version_bump + ✓ parse_version_bump detects patch + ✓ parse_version_bump detects minor + ✓ parse_version_bump detects major + ✓ parse_version_bump handles v prefix + + + + ────────────────────────────────────────── + + All 22 tests passed + + + + shipwright developer-simulation test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + +Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ source guard uses if/then/fi pattern + +Version + ✓ VERSION variable defined + +Help Output + ✓ help contains USAGE + ✓ help contains review subcommand + ✓ help contains address subcommand + ✓ help contains PERSONAS + ✓ help contains simulation_enabled + ✓ --help flag works + +Error Handling + ✓ unknown command exits non-zero + ✓ unknown command mentions error + +Review Subcommand + ✓ review warns when disabled + ✓ review 
returns empty JSON array when disabled + +Address Subcommand + ✓ address with no objections succeeds + ✓ address returns empty JSON + +Persona Definitions + ✓ security persona defined + ✓ performance persona defined + ✓ maintainability persona defined + +Configuration + ✓ SIMULATION_MAX_ROUNDS env var supported + ✓ daemon-config.json checked + +Event Emission + ✓ emits simulation.objection events + ✓ emits simulation.complete events + ✓ emits simulation.addressed events + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Discovery Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows commands + ✓ --help exits 0 + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + ✓ broadcast without args exits 1 + ✓ query without args exits 1 + ✓ inject without args exits 1 + + broadcast subcommand + ✓ broadcast exits 0 + ✓ broadcast confirms + ✓ discoveries.jsonl created + ✓ discoveries.jsonl contains valid JSON + + query subcommand + ✓ query exits 0 + ✓ query finds discovery + ✓ query non-match exits 0 + ✓ query reports no discoveries + + status subcommand + ✓ status exits 0 + ✓ status shows total + + clean subcommand + ✓ clean exits 0 + ✓ clean reports result + + inject subcommand + ✓ inject exits 0 + + internal patterns_overlap + ✓ patterns_overlap matches same pattern + ✓ patterns_overlap rejects different paths + + prioritize subcommand + ✓ prioritize security exits 0 + ✓ prioritize assigns P0 + ✓ prioritize assigns P3 + + score subcommand + ✓ score exits 0 + ✓ score returns numeric result: 89 + + acknowledge subcommand + ✓ acknowledge exits 0 + ✓ consumption file created + ✓ consumption count incremented + + consumption stats + ✓ consumption stats valid JSON + ✓ consumption count correct + + memory promotion + ✓ promotion function exists + + fleet broadcast + ✓ fleet broadcast function exists 
+ + + + ────────────────────────────────────────── + + All 36 tests passed + + + + Shipwright Doc Fleet Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help + ✓ help shows title + ✓ help shows commands section + ✓ help shows fleet roles section + ✓ help shows examples + ✓ --help flag works + ✓ unknown command shows error + Roles + ✓ roles lists doc-architect + ✓ roles lists claude-md + ✓ roles lists strategy-curator + ✓ roles lists pattern-writer + ✓ roles lists readme-optimizer + Audit + ✓ audit shows health header + ✓ audit shows health score + ✓ audit checks doc inventory + ✓ audit checks CLAUDE.md + ✓ audit checks agent roles + ✓ audit creates state file + ✓ audit records health score (91%) + Launch + ✓ launch dry-run shows header + ✓ launch dry-run mentions dry-run + ✓ launch dry-run lists doc-architect + ✓ launch dry-run lists claude-md + ✓ launch dry-run lists strategy-curator + ✓ launch dry-run lists pattern-writer + ✓ launch dry-run lists readme-optimizer + ✓ launch dry-run shows agent count + ✓ launch specific role shows role + ✓ launch specific role shows 1 agent + ✓ launch invalid role shows error + Status + ✓ status shows header + ✓ status shows last run + ✓ status shows health score + ✓ status shows session list + Manifest + ✓ manifest shows generation + ✓ manifest file created + ✓ manifest has documents (16) + Report + ✓ report shows header + ✓ report shows inventory + ✓ report shows volume + ✓ report shows fleet state + ✓ report json shows JSON + ✓ JSON report file created + Retire + ✓ retire shows retiring + ✓ retire shows count + State Persistence + ✓ status shows run count after commands + Events + ✓ doc_fleet events logged (8 events) + Aliases + ✓ start alias works + ✓ stop alias works + + + + ────────────────────────────────────────── + + All 48 tests passed + + + + Shipwright Docs Agent Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + 
Help + ✓ help shows usage + ✓ help shows commands + ✓ --help flag works + ✓ unknown command shows error + Coverage + ✓ coverage shows header + ✓ coverage shows total scripts + API Reference + ✓ api generation succeeds + ✓ api file has title + Wiki + ✓ wiki generation succeeds + ✓ wiki directory created + Scan + ✓ scan shows scanning + Sync + ✓ sync shows sync complete + Impact + ✓ impact shows analysis + State + ✓ docs-agent home directory exists + + + + ────────────────────────────────────────── + + All 14 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright docs — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +AUTO Section Discovery + ▸ find_auto_files discovers CLAUDE.md... PASS + ▸ get_sections extracts section IDs... PASS + +Section Generators + ▸ Architecture table has headers and scripts... PASS + ▸ Architecture table includes CLI router... PASS + ▸ Feature flags table with intelligence config... PASS + ▸ Test suites table picks up test files... PASS + +Section Check & Replace + ▸ check_section detects fresh vs stale... PASS + ▸ replace_section updates content between markers... PASS + +Subcommands + ▸ docs check returns 1 when stale... PASS + ▸ docs sync then check returns 0 (fresh)... PASS + ▸ docs sync is idempotent... PASS + +CLI & Help + ▸ Help output contains all subcommands... PASS + ▸ Unknown command exits with error... PASS + ▸ Default (no args) shows help... PASS + +Edge Cases + ▸ No AUTO markers returns 0... PASS + ▸ Multiple sections all get processed... PASS + ▸ Purpose extracted from script headers... PASS + ▸ Wiki dry-run succeeds... 
PASS + + ────────────────────────────────────────── + 18 passed 0 failed (18 total) + + + + Shipwright Doctor Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + execution + ✓ doctor runs without crash + ✓ output shows PREREQUISITES + ✓ detects tmux + ✓ detects jq + ✓ detects Claude Code CLI + ✓ detects git + + structure + ✓ VERSION variable defined + ✓ Uses set -euo pipefail + ✓ ERR trap is set + ✓ check_pass helper defined + ✓ check_fail helper defined + ✓ output shows Shipwright header + + check logic for tools + ✓ Source checks for tmux + ✓ Source checks for jq + ✓ Source checks for Claude CLI + ✓ Source checks for git + ✓ Source checks for gh + + version flag + ✓ --version outputs sw-doctor and version + ✓ -V short flag works + + missing tool handling + ✓ Doctor reports when jq missing from PATH + ✓ output includes PREREQUISITES section + ✓ output includes INSTALLED FILES section + + auto-fix mode + ✓ --fix-dry flag shows dry-run output +✗ Overlay not found: ~/.tmux/shipwright-overlay.conf +✗ No ~/.claude/settings.json — agent teams not configured +✗ Overlay not found — pane display features unavailable +✗ iTerm2 mouse reporting is DISABLED — tmux cannot receive mouse clicks + ✓ --fix creates .claude directories + ✓ --fix creates ~/.shipwright directories + ✓ --fix creates valid daemon-config.json + ✓ --fix creates valid settings.json + ✓ --fix creates valid budget.json +✗ No ~/.claude/settings.json — agent teams not configured +✗ iTerm2 mouse reporting is DISABLED — tmux cannot receive mouse clicks + ✓ --fix is idempotent (second run succeeds) +✗ No ~/.claude/settings.json — agent teams not configured +✗ iTerm2 mouse reporting is DISABLED — tmux cannot receive mouse clicks + ✓ --fix handles existing config files safely + ✓ --fix without other args works + ✓ Auto-fix output shows AUTO-FIX SUMMARY + ✓ Auto-fix reports what was fixed + + + + ────────────────────────────────────────── + + All 0 tests passed + + + 
+ shipwright dora test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + +Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ source guard uses if/then/fi pattern + +Version + ✓ VERSION variable defined + +Help Output + ✓ help contains USAGE + ✓ help contains show subcommand + ✓ help contains dx subcommand + ✓ help contains ai subcommand + ✓ help contains trends subcommand + ✓ help contains compare subcommand + ✓ help contains export subcommand + ✓ help contains DORA BANDS + ✓ --help flag works + +Error Handling + ✓ unknown command exits non-zero + ✓ unknown command error message + +Show Subcommand (no events) + ✓ show displays DORA Metrics + ✓ show displays Deploy Frequency + ✓ show displays Lead Time + ✓ show displays Change Failure Rate + ✓ show displays MTTR + +DX Subcommand + ✓ dx displays Developer Experience + +AI Subcommand + ✓ ai displays AI Performance Metrics + +Export Subcommand + ✓ export produces JSON with timestamp + ✓ export contains current_period + ✓ export contains previous_period + +Trends Subcommand + ✓ trends displays Trends heading + +Compare Subcommand + ✓ compare displays Period Comparison + +DORA Band Classification + ✓ classify_band function defined + ✓ Elite band classification + ✓ High band classification + ✓ Medium band classification + ✓ Low band classification + +Trend Arrows + ✓ trend_arrow function defined + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Durable Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help / version + ✓ help exits 0 + ✓ help shows USAGE + ✓ help mentions publish + ✓ help mentions consume + ✓ help mentions checkpoint + ✓ help mentions lock + ✓ help mentions compact + ✓ help mentions status + ✓ VERSION variable defined + + publish events + ✓ publish exits 0 + ✓ WAL file created + ✓ Event written to WAL + + error handling + ✓ publish without args exits 
non-zero + ✓ Unknown command exits non-zero + + status command + ✓ status exits 0 + + checkpointing + ✓ checkpoint save exits 0 + ✓ Checkpoint file created + ✓ checkpoint restore exits 0 + + distributed locks + ✓ lock acquire exits 0 + ✓ lock release exits 0 + + script safety + ✓ Uses set -euo pipefail + ✓ Has source guard pattern + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright E2E Orchestrator Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ unknown command exits nonzero + ✓ registry file created on init + ✓ registry is valid JSON + ✓ registry has >= 3 default suites + ✓ register adds suite + ✓ duplicate register fails + ✓ quarantine adds test + ✓ quarantine list has entry + ✓ report handles no results + ✓ flaky handles no history + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Eventbus Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows subcommands + ✓ --help exits 0 + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + + status subcommand + ✓ status exits 0 with empty bus + ✓ status shows title + + publish subcommand + ✓ publish exits 0 + ✓ publish confirms + ✓ events.jsonl created + ✓ events.jsonl contains published event type + ✓ events.jsonl contains correlation_id + ✓ eventbus has 3+ events after multi-publish + + status with events + ✓ status with events exits 0 + ✓ status shows total events + ✓ status shows events by type + + clean subcommand + ✓ clean exits 0 + ✓ clean reports result + + replay subcommand + ✓ replay exits 0 + ✓ replay shows replaying + + publish error handling + ✓ publish with empty type exits 1 + ✓ watch with missing dir exits 1 + ✓ watch shows dir error + + + + 
────────────────────────────────────────── + + All 24 tests passed + + + + Shipwright Feedback Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows subcommands + ✓ --help exits 0 + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + + collect subcommand + ✓ collect on empty dir exits 0 + ✓ collect shows collecting + ✓ collect shows save path + + collect with error log + ✓ collect with errors exits 0 + ✓ collect reports errors + + analyze subcommand + ✓ analyze missing file exits 1 + ✓ analyze shows not found + ✓ analyze exits 0 + ✓ analyze shows report + + learn subcommand + ✓ learn exits 0 + ✓ learn confirms capture + ✓ incidents.jsonl created + ✓ incidents.jsonl has valid JSONL + + report subcommand + ✓ report exits 0 + ✓ report shows incidents + ✓ report shows total + ✓ report no incidents exits 0 + ✓ report says no incidents + + create-issue subcommand + ✓ create-issue with NO_GITHUB exits 0 + ✓ create-issue skips with NO_GITHUB + + post-merge monitoring + ✓ post-merge exits 0 + ✓ post-merge shows monitoring + ✓ post-merge creates monitoring file + ✓ monitoring has merge_sha + ✓ monitoring has environment + + regression detection + ✓ regression detection outputs valid JSON + ✓ no regression flag + ✓ deploy failure detects regression + ✓ deploy failure is P0 + ✓ deploy failure type + ✓ error spike detects regression + ✓ error spike is P1 + ✓ error spike type + + correlate with changes + ✓ correlate outputs valid JSON + ✓ correlation has pr_number + + learn from outcome + ✓ outcomes exits 0 + ✓ outcomes shows recording + ✓ outcomes creates merge-outcomes file + ✓ outcomes file has valid JSONL + ✓ outcome has pr_number + + health report + ✓ health with no data exits 0 + ✓ health shows no data message + ✓ health shows statistics + ✓ health shows success rate + ✓ health shows regressions + + integrated post-merge workflow + ✓ 
workflow: monitoring complete + ✓ workflow: regression detection valid + ✓ workflow: correlation valid + ✓ workflow: outcome recorded + + + + ────────────────────────────────────────── + + All 55 tests passed + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright fix test — Unit Tests ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up test environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-fix-test.Uzr8BE + + ▸ Help output contains expected sections... ✓ + ▸ Help via --help flag... ✓ + ▸ Missing goal shows error... ✓ + ▸ Missing repos shows error... ✓ + ▸ Arg parsing — --repos comma-separated... ✓ + ▸ Arg parsing — --repos-from file... ✓ + ▸ Arg parsing — --repos-from missing file... ✓ + ▸ Arg parsing — --pipeline template... ✓ + ▸ Arg parsing — --max-parallel... ✓ + ▸ Arg parsing — --branch-prefix... ✓ + ▸ Dry run shows what would happen... ✓ + ▸ Dry run does not create state file... ✓ + ▸ Fix status — no sessions... ✓ + ▸ Fix status shows existing sessions... ✓ + ▸ Invalid repo directory shows error... ✓ + ▸ Fix start creates state file... ✓ + ▸ Fix start emits events... ✓ + ▸ Fix start — summary output... ✓ + ▸ Branch name sanitization... ✓ + ▸ Fix header shows configuration... ✓ + ▸ Non-git repo warning... ✓ + ▸ Per-repo event tracking... ✓ + +━━━ Results ━━━ + Passed: 22 + Failed: 0 + Total: 22 + +All 22 tests passed! 
+ + + + shipwright fleet-discover test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + +Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ source guard uses if/then/fi pattern + +Version + ✓ VERSION variable defined + +Help Output + ✓ help contains USAGE + ✓ help contains --org option + ✓ help contains --language option + ✓ help contains --dry-run option + ✓ help contains --json option + ✓ help contains --topic option + ✓ help contains --exclude-topic option + ✓ help contains --min-activity-days + +Argument Validation + ✓ missing --org exits non-zero + ✓ missing --org error message + +Error Handling + ✓ unknown option exits non-zero + ✓ unknown option error + +NO_GITHUB Check + ✓ NO_GITHUB blocks discovery + ✓ NO_GITHUB shows error + +Argument Parsing + ✓ supports --org=value syntax + ✓ supports --config=value syntax + ✓ supports --language=value syntax + ✓ supports --topic=value syntax + +Config Merge + ✓ merge added new repo + ✓ merge preserved existing repo + +Filter Logic + ✓ filters archived repos + ✓ filters disabled repos + ✓ checks has_issues + ✓ language filter applied + ✓ topic filter applied + ✓ exclude topic filter + ✓ checks .shipwright-ignore + +Event Emission + ✓ emits fleet.discover.completed event + ✓ emits fleet.discover.merged event + + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright fleet test — Unit Tests ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up test environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-fleet-test.H3G1EX + + ▸ Help output contains expected sections... ✓ + ▸ Help via --help flag... ✓ + ▸ Config parsing — valid config... ✓ + ▸ Config parsing — missing config file... ✓ + ▸ Config parsing — invalid JSON... ✓ + ▸ Config parsing — empty repos array... 
✓ + ▸ Config defaults applied... ✓ + ▸ Fleet init generates config template... ✓ + ▸ Fleet init skips when config exists... ✓ + ▸ Fleet start spawns tmux sessions per repo... ✓ + ▸ Fleet start skips missing repos... ✓ + ▸ Fleet start skips existing sessions... ✓ + ▸ Fleet start creates fleet state file... ✓ + ▸ Fleet start emits fleet.started event... ✓ + ▸ Fleet start applies repo-level overrides... ✓ + ▸ Fleet stop kills sessions and cleans state... ✓ + ▸ Fleet stop — no fleet running... ✓ + ▸ Fleet stop emits fleet.stopped event... ✓ + ▸ Fleet status — no fleet running... ✓ + ▸ Fleet status shows dashboard... ✓ + ▸ Fleet metrics — no events file... ✓ + ▸ Fleet metrics dashboard output... ✓ + ▸ Fleet metrics JSON output... ✓ + ▸ Fleet metrics period flag... ✓ + ▸ Session name generation... ✓ + ▸ Fleet start skips non-git repos... ✓ + ▸ Unknown subcommand... ✓ + +━━━ Results ━━━ + Passed: 27 + Failed: 0 + Total: 27 + +All 27 tests passed! + + + + shipwright fleet-viz test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + +Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ source guard uses if/then/fi pattern + +Version + ✓ VERSION variable defined + +Help Output + ✓ help contains USAGE + ✓ help contains overview subcommand + ✓ help contains workers subcommand + ✓ help contains insights subcommand + ✓ help contains queue subcommand + ✓ help contains costs subcommand + ✓ help contains export subcommand + ✓ --help flag works + +Error Handling + ✓ unknown command exits non-zero + ✓ unknown command error message + +Overview Subcommand + ✓ overview shows Fleet Overview + ✓ overview shows Active count + ✓ overview shows Queued count + ✓ overview shows Repos count + +Workers Subcommand + ✓ workers shows Worker Allocation + ✓ workers shows Remote Machines + +Queue Subcommand + ✓ queue shows Issue Queue + ✓ queue shows queued items + +Costs Subcommand + ✓ costs shows Fleet Costs + ✓ costs shows Total Spend + ✓ 
costs shows Per-Repo + ✓ costs shows Per-Model + +Export Subcommand + ✓ export produces JSON with active_jobs + +Insights Subcommand + ✓ insights shows Fleet Insights + ✓ insights shows Success Rate + +Default Command + ✓ default command shows Fleet Overview + +Health Helpers + ✓ get_health_status function defined + ✓ color_health function defined + ✓ healthy status handled + ✓ degraded status handled + ✓ failing status handled + + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright frontier — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Adversarial Review + ▸ Adversarial review produces structured findings... ✓ + ▸ Adversarial iteration converges on no critical findings... ✓ + +Developer Simulation + ▸ Simulation generates objections from 3 personas... ✓ + ▸ Simulation address returns action items... ✓ + +Architecture Enforcer + ▸ Architecture model has valid schema... ✓ + ▸ Architecture validates changes (violation detected)... ✓ + +Graceful Degradation + ▸ All three degrade gracefully when claude unavailable... 
✓ + +════════════════════════════════════════════════════ + All 7 tests passed ✓ +════════════════════════════════════════════════════ + + + + Shipwright GitHub App Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help + ✓ help shows usage + ✓ help shows commands + ✓ --help flag works + ✓ unknown command shows error + Status + ✓ status without config warns + Events + ✓ events with no log warns + Manifest + ✓ manifest contains app name + ✓ manifest contains webhook URL + ✓ manifest success message + ✓ manifest output is valid JSON + Configured Status + ✓ configured status shows app ID + ✓ configured status shows install ID + ✓ events shows recent events + Token + ✓ token without key file errors + Verify + ✓ verify without secret errors + + + + ────────────────────────────────────────── + + All 15 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright github-checks — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Checks API Availability + ▸ _gh_checks_available: returns true when API accessible... {"check_runs":[],"total_count":0} +✓ + ▸ _gh_checks_available: returns false on 403... ✓ + +Check Run CRUD + ▸ gh_checks_create_run: returns run ID from response... ✓ + ▸ gh_checks_create_run: handles 403 gracefully... ⚠ Failed to create check run 'test-check' (API returned 1) +ERROR: scripts/sw-github-checks-test.sh:224 exited with status 1 +✓ + ▸ gh_checks_update_run: sends correct PATCH request... ✓ + ▸ gh_checks_update_run: skips when run_id empty... ✓ + ▸ gh_checks_annotate: respects 50-annotation limit... ✓ + ▸ gh_checks_list_runs: parses response correctly... ✓ + ▸ gh_checks_complete: convenience wrapper works... ✓ + +Pipeline Integration + ▸ gh_checks_pipeline_start: creates runs for all stages... ✓ + ▸ gh_checks_stage_update: looks up stored run IDs... ✓ + +NO_GITHUB Guard + ▸ NO_GITHUB: all functions return early... 
✓ + +════════════════════════════════════════════════════ + All 12 tests passed ✓ +════════════════════════════════════════════════════ + + +╔═══════════════════════════════════════════════════╗ +║ shipwright github-deploy — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Deployment CRUD + ▸ gh_deploy_create: returns deployment ID... ✓ + ▸ gh_deploy_create: handles 403 gracefully... ⚠ Failed to create deployment for ref 'main' to 'production' (API returned 1) +ERROR: scripts/sw-github-deploy-test.sh:158 exited with status 1 +✓ + ▸ gh_deploy_update_status: sends correct POST... ✓ + ▸ gh_deploy_update_status: skips when deploy_id empty... ✓ + ▸ gh_deploy_list: parses deployment list... ✓ + ▸ gh_deploy_latest: returns first result... ✓ + +Rollback + ▸ gh_deploy_rollback: creates new deployment with prev ref... ▸ Rolling back to ref: v2.9 +✓ Rolled back to v2.9 (deployment 301) +✓ + +Pipeline Integration + ▸ gh_deploy_pipeline_start: stores deployment ID... ✓ + ▸ gh_deploy_pipeline_complete: updates status correctly... ✓ + +NO_GITHUB Guard + ▸ NO_GITHUB: all functions return early... ✓ + +════════════════════════════════════════════════════ + All 10 tests passed ✓ +════════════════════════════════════════════════════ + + +━━━ shipwright github-graphql tests ━━━ + + ▸ Fresh cache returns cached data... ✓ + ▸ Stale cache returns miss... ✓ + ▸ Atomic writes leave no temp files... ✓ + ▸ gh_file_change_frequency returns count... ✓ + ▸ gh_contributors returns parsed list... ✓ + ▸ gh_similar_issues truncates long text... ✓ + ▸ gh_branch_protection handles 404... ✓ + ▸ gh_security_alerts handles 403... ✓ + ▸ gh_dependabot_alerts handles 403... ✓ + ▸ gh_repo_context aggregates data... ✓ + ▸ NO_GITHUB returns defaults... ✓ + ▸ _gh_detect_repo parses SSH URL... ✓ + ▸ _gh_detect_repo parses HTTPS URL... ✓ + ▸ _gh_detect_repo parses HTTPS without .git... ✓ + ▸ gh_codeowners parses CODEOWNERS file... 
✓ + ▸ Cache clear removes all files... ✓ + ▸ gh_commit_history returns parsed commits... ✓ + ▸ gh_blame_data aggregates authors... ✓ + ▸ gh_actions_runs calculates duration... ✓ + ▸ Events emitted for cache hit/miss... ✓ + +━━━ Results ━━━ + Passed: 20 + Failed: 0 + Total: 20 + +All 20 tests passed! + + + + Shipwright Guild Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ no args shows help + ✓ list shows Available Guilds + ✓ guild config.json created + ✓ guild config is valid JSON + ✓ guilds.json data file created + ✓ show security guild + ✓ show invalid guild exits nonzero + ✓ show without name exits nonzero + ✓ add pattern succeeds + ✓ pattern saved in data file + ✓ report shows guild data + ✓ report for specific guild + ✓ inject security shows knowledge + ✓ unknown command exits nonzero + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright heartbeat + checkpoint — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Heartbeat Lifecycle + ▸ Write heartbeat creates JSON file... ✓ + ▸ Check heartbeat reports alive... ✓ Job test-job-alive alive (0s ago) +✓ + ▸ Check heartbeat reports stale... ⚠ Job test-job-stale stale (195218505s ago, timeout: 120s) +✓ + ▸ Clear heartbeat removes file... ✓ Cleared heartbeat for job: test-job-clear +✓ + ▸ List heartbeats returns JSON array... ✓ + ▸ Heartbeat update overwrites existing... ✓ + ▸ Check missing heartbeat returns error... ✓ + ▸ Heartbeat dir auto-created when missing... ✓ + +Checkpoint Lifecycle + ▸ Checkpoint save creates JSON file... ✓ Checkpoint saved for stage build (iteration 5) +✓ + ▸ Checkpoint restore outputs JSON... ✓ Checkpoint saved for stage test (iteration 3) +✓ + ▸ Checkpoint restore missing stage fails... ✓ + ▸ Checkpoint clear removes file... 
✓ Checkpoint saved for stage review (iteration 1) +✓ Cleared checkpoint for stage review +✓ + ▸ Checkpoint clear --all removes all... ✓ Checkpoint saved for stage build (iteration 1) +✓ Checkpoint saved for stage test (iteration 2) +✓ Cleared 2 checkpoint(s) +✓ + ▸ Checkpoint save with files-modified... ✓ Checkpoint saved for stage build (iteration 7) +✓ + +Integration + ▸ Pipeline script has heartbeat functions... ✓ + ▸ Loop script has heartbeat and checkpoint... ✓ + ▸ Pipeline has human intervention checks... ✓ + +════════════════════════════════════════════════════ + All 17 tests passed ✓ +════════════════════════════════════════════════════ + +sw-hello-test.sh + ✓ hello command outputs 'hello world' + ✓ hello command exits with code 0 + ✓ hello --help displays help text + ✓ hello -h displays help text + ✓ hello --version displays version + ✓ hello with invalid option exits with code 1 + +PASS: 6 +FAIL: 0 + + + Shipwright Hygiene Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows subcommands + ✓ --help exits 0 + + error handling + ✓ unknown subcommand exits 1 + ✓ unknown subcommand shows error + + report subcommand + ✓ report exits 0 + ✓ report shows generating + ✓ report exits 0 + ✓ report creates JSON file + ✓ report JSON is valid + ✓ report JSON has timestamp + ✓ report JSON has sections + + structure subcommand + ✓ structure exits 0 + ✓ structure reports validating + + naming subcommand + ✓ naming exits 0 + ✓ naming shows checking + + dead-code subcommand + ✓ dead-code exits 0 + ✓ dead-code shows scanning + + dependencies subcommand + ✓ dependencies exits 0 + ✓ dependencies shows auditing + + platform-refactor subcommand + ✓ platform-refactor exits 0 + ✓ platform-refactor scans for hardcoded/fallback + ✓ platform-refactor creates platform-hygiene.json with counts + + policy read (policy_get from config) + ✓ policy_get returns value from config 
+ ✓ policy_get returns default when key missing + + + + ────────────────────────────────────────── + + All 26 tests passed + + + + Shipwright Incident Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help / version + ✓ help exits 0 + ✓ help shows USAGE + ✓ help mentions watch + ✓ help mentions list + ✓ help mentions report + ✓ help mentions stats + ✓ VERSION variable defined + + error handling + ✓ Unknown command exits non-zero + + list command + ✓ list with no incidents exits 0 + + stats command + ✓ stats with no data exits 0 + + state management + ✓ Incident config created + ✓ Incident config is valid JSON + + script safety + ✓ Uses set -euo pipefail + ✓ Has source guard pattern + ✓ ERR trap is set + + config subcommand + ✓ config subcommand creates valid JSON config + + subcommand usage +ERROR: scripts/sw-incident-test.sh:203 exited with status 1 + ✓ show subcommand fails or shows usage when missing args + ✓ report subcommand shows usage when missing args + + detect_pipeline_failures + ✓ detect_pipeline_failures defined and callable + ✓ report with nonexistent incident handles gracefully + ✓ gap list subcommand produces expected output + + timeline updates + ✓ Timeline update creates valid JSON + ✓ Timeline entry has all required fields + + correlation engine + ✓ Correlation engine produces valid JSON + + escalation logic + ✓ Escalation includes P0 rules + ✓ Escalation function defined + + rollback verification + ✓ Rollback verification function defined + ✓ Rollback verification updates timeline + + auto-remediate subcommand + ✓ auto-remediate subcommand shows usage when missing args + + deep analysis + ✓ Deep analysis function defined + ✓ Deep analysis returns valid JSON with required fields + ✓ Config includes auto_remediate_enabled flag + ✓ Config includes escalation rules for all severities + + post-mortem report + ✓ Post-mortem report includes Timeline section + ✓ Post-mortem report references timeline 
fields + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright init — E2E Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up sandboxed environment... +Temp dir: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-init-test.rLV8jz + +Configuration + ▸ Settings.json created with agent teams... ✓ + ▸ Settings merge preserves existing vars... ✓ + ▸ tmux.conf installed... ✓ + ▸ Overlay installed... ✓ + +Templates + ▸ Team templates installed (>= 10)... ✓ + ▸ Pipeline templates installed (>= 5)... ✓ + ▸ Legacy templates path populated... ✓ + +Robustness + ▸ Idempotency — double init safe... ✓ + ▸ Doctor runs at end... ✓ + ▸ Help flag... ✓ + +Hook Wiring + ▸ JSONC stripped from settings.json... ✓ + ▸ Hooks wired into settings.json... ✓ + ▸ Hook wiring preserves existing hooks... ✓ + ▸ SessionStart hook installed... ✓ + ▸ Hook wiring with pre-existing settings... ✓ + +Repair & Cleanup + ▸ Legacy overlay cleanup... ✓ + ▸ Legacy overlay source-file reference stripped... ✓ + ▸ Repair mode forces clean reinstall... ✓ + ▸ Plugin direct-clone fallback (outside tmux)... ✓ + ▸ Post-install verification... ✓ + ▸ tmux adapter deployed... 
✓ + +════════════════════════════════════════════════════ + All 21 tests passed ✓ +════════════════════════════════════════════════════ + + + + Shipwright Instrument Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage + ✓ help shows commands + ✓ help shows start + ✓ help shows record + ✓ help shows version + + Start Command + ✓ start without --run-id exits non-zero + ✓ start without --run-id shows error + ✓ start with --run-id exits 0 + ✓ start confirms run ID + ✓ start creates run file + ✓ run file contains correct run_id + ✓ run file contains correct issue + + Record Command + ✓ record exits 0 + ✓ record confirms metric + ✓ run file has 1 metric + ✓ record without all args exits non-zero + ✓ record on missing run exits non-zero + ✓ record on missing run shows error + + Stage Start/End + ✓ stage-start exits 0 + ✓ stage-start confirms stage + ✓ stage-end exits 0 + ✓ stage result recorded + + Finish Command + ✓ finish exits 0 + ✓ finish confirms completion + ✓ finish removes active run file + ✓ finish writes to completed JSONL + + Trends & Export + ✓ trends with no data exits 0 + ✓ trends with no data warns + ✓ export with no data exits 0 + ✓ export with no data warns + + Error Handling + ✓ unknown command exits non-zero + ✓ unknown command shows error + ✓ events.jsonl created from instrument operations + + + + ────────────────────────────────────────── + + All 33 tests passed + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright intelligence test — Unit Tests ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up test environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-intelligence-test.m6i2Hq + + ▸ analyze_issue returns valid schema... ✓ + ▸ Cache hit on second call with same input... 
WARN: Unknown event type 'intelligence.cache_hit' — update config/event-schema.json +✓ + ▸ Graceful degradation when claude CLI unavailable... ✓ + ▸ compose_pipeline produces valid pipeline JSON... ✓ + ▸ recommend_model returns valid model names... WARN: Unknown event type 'intelligence.model' — update config/event-schema.json +WARN: Unknown event type 'intelligence.model' — update config/event-schema.json +WARN: Unknown event type 'intelligence.model' — update config/event-schema.json +WARN: Unknown event type 'intelligence.model' — update config/event-schema.json +✓ + ▸ predict_cost returns numeric estimates... WARN: Unknown event type 'intelligence.prediction' — update config/event-schema.json +✓ + ▸ Cache TTL expiry returns miss... scripts/sw-intelligence-test.sh: line 334: _intelligence_md5: command not found +✓ + ▸ search_memory returns ranked results... ✓ + ▸ Feature flag disabled returns fallback... ✗ Expected 'intelligence_disabled', got 'null' (returns intelligence_disabled error) +✓ + ▸ Events emitted for analysis... ✓ + ▸ recommend_model emits events... WARN: Unknown event type 'intelligence.model' — update config/event-schema.json +✓ + ▸ Cache init creates file if missing... ✓ + +━━━ Results ━━━ + Passed: 12 + Failed: 0 + Total: 12 + +All 12 tests passed! 
+ + + + Shipwright Jira Test Suite + ══════════════════════════════════════════ + + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ VERSION variable defined + ✓ VERSION is semver: 3.2.4 + + Help Output + ✓ help mentions USAGE + ✓ help mentions sync command + ✓ help mentions update command + ✓ help mentions status command + ✓ help mentions init command + ✓ help mentions JIRA_BASE_URL + ✓ --help works + + Error Handling + +shipwright jira — Jira ↔ GitHub Bidirectional Sync + +USAGE + shipwright jira [options] + +COMMANDS + sync [--dry-run] Sync Jira To Do issues → GitHub + update Update linked Jira ticket status + status Show Jira board dashboard + init Configure Jira connection + help Show this help + +STATUS VALUES + started Pipeline spawned → Jira: In Progress + review PR created → Jira: In Review + done Pipeline complete → Jira: Done + failed Pipeline failed → Jira: adds failure comment + +EXAMPLES + shipwright jira init # Set up Jira connection + shipwright jira sync # Sync To Do → GitHub + shipwright jira sync --dry-run # Preview what would sync + shipwright jira update 42 started # Mark as In Progress + shipwright jira update 42 review # Mark as In Review + shipwright jira update 42 done # Mark as Done + shipwright jira status # Show board dashboard + +ENVIRONMENT + JIRA_BASE_URL Jira instance URL (or use 'jira init' to save) + JIRA_EMAIL Account email for authentication + JIRA_API_TOKEN API token from Atlassian account + JIRA_PROJECT_KEY Jira project key (e.g. 
PROJ) + ✓ unknown command exits non-zero + ✓ unknown command shows error + + Default Behavior + ✓ no-arg defaults to help + + Configuration + ✓ sync without config shows error + ✓ update without config shows error + ✓ status without config shows error + + Config Loading + ✓ sync with config proceeds + + Update Subcommand + ✓ update without args shows usage + + Notify Integration + ✓ notify subcommand executes without crash + + Atomic Writes + ✓ init uses atomic write (tmp + mv) + ✓ config file gets restricted permissions + + Event Emission + ✓ emits jira events + ✓ uses EVENTS_FILE for event logging + + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright launchd + systemd — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +OS Detection + ▸ macOS detection sets OSTYPE correctly... ✓ + ▸ Linux detection routes to systemd... ✓ + +macOS Plist Generation + ▸ Daemon plist has correct structure... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Dashboard plist has correct arguments... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Connect plist created only when team-config exists... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/Library/LaunchAgents/com.shipwright.dashboard.plist +▸ Skipping connect plist — /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/.shipwright/team-config.json not found +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/fresh-home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Plist files have correct permissions... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + +Install Command + ▸ Install creates LaunchAgents directory... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Install calls launchctl load... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +scripts/sw-launchd-test.sh: line 457: [[: 0 +0: syntax error in expression (error token is "0") +✓ + ▸ Install fails gracefully if sw binary not found... ✓ + +Uninstall Command + ▸ Uninstall removes plist files... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +▸ Uninstalling launchd agents... +Mock unloaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Unloaded daemon service +✓ Removed daemon plist +Mock unloaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Unloaded dashboard service +✓ Removed dashboard plist +Mock unloaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Unloaded connect service +✓ Removed connect plist + +✓ Uninstalled all launchd agents +✓ + ▸ Uninstall calls launchctl unload... ▸ Installing launchd agents... 
+✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +▸ Uninstalling launchd agents... 
+Mock unloaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Unloaded daemon service +✓ Removed daemon plist +Mock unloaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Unloaded dashboard service +✓ Removed dashboard plist +Mock unloaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Unloaded connect service +✓ Removed connect plist + +✓ Uninstalled all launchd agents +scripts/sw-launchd-test.sh: line 549: [[: 0 +0: syntax error in expression (error token is "0") +✓ + ▸ Uninstall on empty system doesn't error... ▸ Uninstalling launchd agents... + +✓ Uninstalled all launchd agents +✓ + +Status Command + ▸ Status command checks launchctl list... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Status shows log directory... ✓ + +Help Command + ▸ Help command shows usage and examples... ✓ + ▸ Help is shown for unknown commands... ✓ + +Environment & Configuration + ▸ Plist contains correct environment variables... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Working directory is set in plist... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Plist configures stdout and stderr logging... ▸ Installing launchd agents... +✓ Created daemon plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +⚠ server.ts not found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/sw-launchd-test.2gQX3c/dashboard/server.ts — dashboard plist will reference a missing file +✓ Created dashboard plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Created connect plist: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +▸ Loading launchd services... 
+Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.daemon.plist +✓ Loaded daemon service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.dashboard.plist +✓ Loaded dashboard service +Mock loaded: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/Library/LaunchAgents/com.shipwright.connect.plist +✓ Loaded connect service + +▸ Services will auto-start on next login +▸ View logs: tail -f /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-launchd-test.2gQX3c/home/.shipwright/logs/*.log +▸ Uninstall: shipwright launchd uninstall +✓ + ▸ Version variable is defined... ✓ + +════════════════════════════════════════════════════ + All 20 tests passed ✓ +════════════════════════════════════════════════════ + + + + Shipwright Linear Test Suite + ══════════════════════════════════════════ + + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ VERSION variable defined + ✓ VERSION is semver: 3.2.4 + + Help Output + ✓ help mentions USAGE + ✓ help mentions sync command + ✓ help mentions update command + ✓ help mentions status command + ✓ help mentions init command + ✓ help mentions LINEAR_API_KEY + ✓ --help works + + Error Handling + +shipwright linear — Linear ↔ GitHub Bidirectional Sync + +USAGE + shipwright linear [options] + +COMMANDS + sync [--dry-run] Sync Linear Todo issues → GitHub + update Update linked Linear ticket status + status Show sync dashboard + init Configure Linear API key + help Show this help + +STATUS VALUES + started Pipeline spawned → Linear: In Progress + review PR created → Linear: In Review + done Pipeline complete → Linear: Done + failed Pipeline failed → Linear: adds failure comment + +EXAMPLES + shipwright linear init # Set up API key + shipwright linear sync # Sync Todo → GitHub + shipwright linear sync --dry-run # Preview what would sync + 
shipwright linear update 42 started # Mark as In Progress + shipwright linear update 42 review # Mark as In Review + shipwright linear update 42 done # Mark as Done + shipwright linear status # Show dashboard + +ENVIRONMENT + LINEAR_API_KEY API key (or use 'linear init' to save) + LINEAR_TEAM_ID Override team ID + LINEAR_PROJECT_ID Override project ID + ✓ unknown command exits non-zero + ✓ unknown command shows error + + Default Behavior + ✓ no-arg defaults to help + + Configuration + ✓ sync without API key shows error + ✓ update without API key shows error + ✓ status without API key shows error + + Config Loading + ✓ sync with config proceeds + + Update Subcommand + ✓ update without args shows usage + + Notify Integration + ✓ notify subcommand executes without crash + + Atomic Writes + ✓ init uses atomic write (tmp + mv) + ✓ config file gets restricted permissions + + GraphQL Helper + ✓ linear_graphql helper defined + ✓ uses jq --arg for safe JSON escaping + + Event Emission + ✓ emits linear events + ✓ uses EVENTS_FILE for event logging + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + shipwright logs test + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + +Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + +Version + ✓ VERSION variable defined + +Help Output + ✓ help contains USAGE + ✓ help contains --pane option + ✓ help contains --follow option + ✓ help contains --grep option + ✓ help contains --capture option + ✓ help contains -f shorthand + +Help Exit Code + ✓ help exits 0 + ✓ -h exits 0 + +Error Handling + ✓ unknown option exits non-zero + ✓ unknown option error + +List Logs + ✓ list shows Agent Logs heading + ✓ list shows team directory + +Team Logs + ✓ team logs shows team name + ✓ team logs lists log files + ✓ team logs shows builder log + +Grep Search + ✓ grep finds ERROR pattern + ✓ grep shows file context + ✓ grep shows no matches warning + +Pane Filter + ✓ 
pane filter shows reviewer logs + ✓ nonexistent pane warns + +Capture Command + ✓ capture reports status + +Missing Arguments + ✓ missing --pane value exits non-zero + ✓ missing --pane shows error + ✓ missing --grep value exits non-zero + ✓ missing --grep shows error + +Intelligence Integration + ✓ intelligence_available function defined + ✓ semantic_rank_results function defined + +Script Structure + ✓ LOGS_DIR defined + ✓ capture_logs function defined + ✓ list_logs function defined + ✓ show_team_logs function defined + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Loop Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help / version + ✓ --help exits 0 + ✓ --help shows usage + ✓ --help shows options + ✓ --help mentions --max-iterations + ✓ --help mentions --test-cmd + ✓ --help mentions --model + ✓ --help mentions --agents + ✓ --help mentions --resume + ✓ VERSION variable defined in sw-loop.sh + + argument parsing + ✓ No arguments exits non-zero + + script safety + ✓ Uses set -euo pipefail + ✓ ERR trap is set + ✓ SIGHUP trap set for daemon resilience + ✓ CLAUDECODE env var is unset + + defaults + ✓ Default MAX_ITERATIONS is 20 + ✓ Default AGENTS is 1 + ✓ Default MAX_RESTARTS is 0 + ✓ Sources lib/compat.sh + + json output format + ✓ build_claude_flags includes --output-format json + effort level flag + ✓ build_claude_flags supports --effort + fallback model flag + ✓ build_claude_flags supports --fallback-model + ✓ accumulate_loop_tokens parses JSON usage + ✓ LOOP_COST_MILLICENTS initialized + ✓ write_loop_tokens includes cost_usd + ✓ _extract_text_from_json helper defined + ✓ validate_claude_output helper defined + ✓ check_budget_gate helper defined + ✓ run_claude_iteration separates stdout from stderr + + json extraction robustness + ✓ _extract_text_from_json handles empty file + ✓ _extract_text_from_json extracts .result from JSON + ✓ _extract_text_from_json passes 
through plain text + + default config from source + ✓ Default MAX_ITERATIONS is 20 (from source) + ✓ Default AGENTS is 1 (from source) + ✓ Default MAX_RESTARTS is 0 (from source) + + json extraction edge cases + ✓ _extract_text_from_json handles nested JSON objects + ✓ _extract_text_from_json handles binary garbage without crash + + script structure + ✓ Script has circuit breaker logic + ✓ Script has stuckness detection + ✓ Script has test/quality gate functions + ✓ Help text defines --model and --agents flags + ✓ Help text defines --test-cmd and --resume flags + help mentions --effort + ✓ Help text defines --effort flag + help mentions --fallback-model + ✓ Help text defines --fallback-model flag + + loop behavior: LOOP_COMPLETE + ✓ Loop detected completion signal + + loop behavior: iterations on test failure + ✓ Loop runs multiple iterations when tests fail initially + + loop behavior: max iterations + ✓ Loop stops at max iterations + + loop behavior: stuckness detection + ✓ Loop stops at limit (stuckness test) + + loop behavior: budget gate + ✓ Budget gate stops loop + + validate_claude_output + ✓ validate_claude_output catches corrupt output + + loop behavior: progress tracking + ✓ Loop tracks progress via git + + context efficiency metrics + ✓ loop.context_efficiency event exists in run_claude_iteration + ✓ Context efficiency emits raw and trimmed char counts + ✓ Context efficiency emits trim_ratio and budget_utilization + ✓ raw_prompt_chars measured from pre-trim prompt + + multi-test gate + ✓ ADDITIONAL_TEST_CMDS variable defined + ✓ --additional-test-cmds flag in arg parser + ✓ --help documents --additional-test-cmds + ✓ run_test_gate writes test-evidence JSON + ✓ run_audit_agent reads structured test evidence + + verification gap handler + ✓ Verification gap detection present + ✓ Verification gap resolved event emitted + ✓ Verification gap confirmed event emitted + ✓ Verification gap can override audit result + ✓ Verification re-runs tests to dedicated log 
+ ✓ Mid-build test file discovery integrated + + + + ────────────────────────────────────────── + + All 65 tests passed + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright memory+cost test — Unit Tests for Memory & Cost ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up mock environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-memory-test.Ym1tNo + + ▸ Memory capture from pipeline state... ✓ + ▸ Memory inject returns context for each stage... ✓ + ▸ Failure capture stores patterns... ✓ + ▸ Pattern detection identifies project type... ✓ + ▸ Cross-repo vs per-repo isolation... ✓ + ▸ Memory show displays dashboard... ✓ + ▸ Memory search finds matching entries... ✓ + ▸ Memory export produces valid JSON... ✓ + ▸ Memory forget clears repo memory... ✓ + ▸ Cost calculation for each model... ✓ + ▸ Cost recording writes to costs.json... ✓ + ▸ Budget set and check... ✓ + ▸ Cost dashboard runs without errors... ✓ + ▸ Cost JSON output is valid... ✓ + ▸ Actionable failures threshold filtering... ✓ + ▸ Actionable failures with no file returns []... ✓ + ▸ DORA baseline calculation from events... ✓ + ▸ Error log entries captured into failures.json... ✓ + ▸ Fix outcome tracking increments counters... ✓ + ▸ Closed-loop inject returns formatted fix... ✓ + ▸ Global aggregation promotes frequent patterns... ✓ + ▸ Finalize pipeline runs capture + aggregate... ▸ Capturing pipeline learnings for test-org/test-repo... +✓ Captured pipeline learnings (status: complete) +✓ + +━━━ Results ━━━ + Passed: 22 + Failed: 0 + Total: 22 + +All 22 tests passed! 
+ + + + Shipwright Mission Control Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Navigation + ✓ help shows usage + ✓ help shows commands + ✓ --help flag works + ✓ unknown command shows error + Overview + ✓ overview shows MISSION CONTROL header + ✓ overview shows Summary Statistics + ✓ overview shows Active Pipelines + ✓ overview with active job shows count + Agent Tree + ✓ agents shows hierarchy + ✓ agents shows Pipeline Agent + Resources + ✓ resources shows utilization + Alerts + ✓ alerts shows alert feed + Stage Commands + ✓ pause without id exits nonzero + ✓ pause emits success + ✓ resume without id exits nonzero + ✓ resume emits success + ✓ skip without stage shows usage + ✓ skip emits success + ✓ retry emits success + + + + ────────────────────────────────────────── + + All 19 tests passed + + + + Shipwright Model Router Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage + ✓ help shows route + ✓ help shows escalate + ✓ help shows config + + Route Model + ✓ route intake at 50 = haiku + ✓ route build at 50 = opus + ✓ route test at 50 = sonnet + ✓ route build at 10 (low) = sonnet + ✓ route intake at 90 (high) = opus + ✓ route unknown stage at 50 = sonnet + + Escalate Model + ✓ escalate haiku -> sonnet + ✓ escalate sonnet -> opus + ✓ escalate opus -> opus (ceiling) + ✓ escalate unknown exits non-zero + + Config + ✓ config show displays JSON + ✓ config creates default file + ✓ config set confirms update + ✓ config set persists value + + Estimate + ✓ estimate shows stages + ✓ estimate shows total + + Report + ✓ report with no data warns + + Record Usage + ✓ record_usage creates usage file + ✓ record_usage writes entries + ✓ report with data shows summary + ✓ report shows total runs + ✓ report shows cost + ✓ report shows model counts + + Route All Stages & Complexity + ✓ route intake at 50 returns model + ✓ route plan 
at 50 returns model + ✓ route design at 50 returns model + ✓ route build at 50 returns model + ✓ route test at 50 returns model + ✓ route review at 50 returns model + ✓ route compound_quality at 50 returns model + ✓ route validate at 50 returns model + ✓ route monitor at 50 returns model + ✓ route plan at low complexity = sonnet + ✓ route plan at high complexity = opus + + Config Set/Show Cycle +✓ Updated cost_aware_mode = false + ✓ config show reflects settings + ✓ config set persists + ✓ estimate with low complexity shows stages + ✓ estimate shows Total + ✓ estimate with high complexity + + Error Handling + ✓ unknown subcommand exits non-zero + ✓ unknown subcommand shows error + + Reasoning Chains + ✓ chain config shows templates + ✓ chain config shows explore-synthesize-decide + ✓ chain config shows fast-verify + ✓ chain config shows deep-analysis + ✓ chain config creates templates file + ✓ chain define shows success + ✓ chain define persists custom chain + ✓ chain_score_confidence returns numeric score + ✓ chain_score_confidence scores conclusion text higher + ✓ chain execute returns valid JSON + ✓ chain execute result has steps + ✓ chain step-cost returns numeric cost for haiku + ✓ chain step-cost ordering correct (haiku < sonnet < opus) + ✓ chain report outputs summary + ✓ chain define with invalid JSON exits non-zero + ✓ chain define validates JSON + ✓ explore-decide has 2 steps + ✓ explore-decide first step is haiku + ✓ explore-decide last step is opus + ✓ chain execute with invalid chain exits non-zero + ✓ chain execute shows error + + + + ────────────────────────────────────────── + + All 66 tests passed + + + + Shipwright OTel Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage + ✓ help shows metrics + ✓ help shows trace + ✓ help shows export + ✓ help shows webhook + ✓ help shows dashboard + + Metrics + ✓ metrics text has pipelines_total + ✓ metrics text has 
active_pipelines + ✓ metrics text has cost + ✓ metrics text has queue depth + ✓ metrics json has metrics key + ✓ metrics json has pipelines_total + ✓ metrics json is valid JSON + + Trace + ✓ trace has resourceSpans + ✓ trace has service.name + ✓ trace output is valid JSON + + Dashboard + ✓ dashboard has dashboard key + ✓ dashboard output is valid JSON + + Report + ✓ report shows header + ✓ report shows events section + ✓ report shows pipeline metrics + ✓ report shows recommendations + + Webhook + ✓ webhook without URL exits non-zero + ✓ webhook without URL shows error + + Metrics With Events + ✓ metrics count total pipelines = 2 + + Error Handling + ✓ unknown command exits non-zero + ✓ unknown command shows error + + + + ────────────────────────────────────────── + + All 27 tests passed + + + + Shipwright Oversight Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ unknown command exits 1 + ✓ members shows board + ✓ members.json created + ✓ members.json is valid JSON + ✓ config show works + ✓ config.json created + ✓ stats shows statistics + ✓ history handles empty + ✓ review without args exits nonzero + ✓ review submission accepted + ✓ review JSON file created + ✓ review file is valid JSON + ✓ stats shows total reviews >= 1 + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Patrol Meta Test Suite + ══════════════════════════════════════════ + + + Script Structure + ✓ contains sourced-file note + ✓ does NOT have set -euo pipefail as code (sourced script) + ✓ no main() function (sourced script) + + Function Definitions + ✓ patrol_meta_run() defined + ✓ patrol_meta_create_issue() defined + ✓ patrol_meta_untested_scripts() defined + ✓ patrol_meta_bash_compat() defined + ✓ patrol_meta_version_sync() defined + ✓ patrol_meta_dora_trends() defined + ✓ patrol_meta_template_effectiveness() defined + ✓ 
patrol_meta_memory_pruning() defined + ✓ patrol_meta_event_analysis() defined + + Sourcing + ✓ script can be sourced without error + + NO_GITHUB Dry Run + ✓ checks NO_GITHUB for dry-run mode + ✓ supports dry-run output + + Bash Compat Checks + ✓ bash compat check looks for declare -A + ✓ bash compat check looks for readarray/mapfile + + Dedup Logic + ✓ dedup logic skips duplicate issues + ✓ dedup searches existing issues + + Memory Pruning Check + ✓ memory pruning uses du -sk for size check + ✓ memory pruning has MB threshold + + Event Analysis + ✓ event analysis uses 7-day window + ✓ event analysis checks pipeline.completed events + + Closed-Loop Self-Improvement Functions + ✓ patrol_meta_score_impact() defined + ✓ patrol_meta_execute_fix() defined + ✓ patrol_meta_verify_fix() defined + ✓ patrol_meta_rollback() defined + ✓ patrol_meta_learn() defined + ✓ patrol_meta_batch_improve() defined + ✓ patrol_meta_auto() defined + + Impact Scoring + ✓ untested-scripts category gets high score (75) + ✓ dora-regression category gets critical score (85) + + Pipeline Integration + ✓ execute_fix calls shipwright pipeline start + ✓ uses --worktree for isolated execution + + Learning & Memory + ✓ records self-improvements in memory + ✓ emits patrol.meta_learned events + + Batch Processing + ✓ batch_improve looks for meta-improvement labeled issues + ✓ sorts issues by impact score (highest first) + + Dry-Run Mode + ✓ new functions respect NO_GITHUB for dry-run + + Autonomous Loop + ✓ patrol_meta_auto() orchestrates full loop + ✓ auto loop has stage 1 (detection) and stage 2 (batch) + + + + + ────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright pipeline composer — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Pipeline Composition + ▸ Composed pipeline has valid stage ordering... ✓ + ▸ High-risk issue gets security stages... 
.claude/pipeline-artifacts/composed-pipeline.json +✓ + ▸ Fallback to static template when no intelligence... ✓ + +Conditional Stage Insertion + ▸ Stage inserted at correct position after build... ✓ + ▸ Insert into nonexistent stage fails... ✓ + +Model Downgrade + ▸ Budget constraint triggers model downgrades... ✓ + ▸ Downgrade with nonexistent stage fails... ✓ + +Pipeline Validation + ▸ Validation accepts valid pipeline... ✓ + ▸ Validation rejects invalid ordering (test before build)... ✓ + ▸ Validation rejects missing stage ids... ✓ + ▸ Validation rejects missing stages array... ✓ + +Iteration Estimation + ▸ Iteration estimates are reasonable (1-50 range)... ✓ + +════════════════════════════════════════════════════ + All 12 tests passed ✓ +════════════════════════════════════════════════════ + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright pipeline test — E2E Validation (Real Subprocess) ║ +╚═══════════════════════════════════════════════════════════════════╝ Setting up mock environment... -✓ Environment ready: /tmp/sw-pipeline-test.76QxGO +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-pipeline-test.B3iiel + + ▸ Preflight passes with all mocks... ✓ + ▸ Preflight fails when sw-loop.sh missing... ✓ + ▸ Start requires --goal or --issue... ✓ + ▸ Intake with --goal creates branch + artifacts... ✓ + ▸ Intake with --issue fetches from GitHub... ✓ + ▸ Plan generates plan.md, dod.md, tasks... ✓ + ▸ Build invokes sw loop and commits... ✓ + ▸ Test stage captures results to log... ✓ + ▸ Review generates report with severities... ✓ + ▸ PR stage creates PR URL artifact... ✓ + ▸ Full E2E pipeline (6 stages)... ✓ + ▸ Resume continues from partial state... ✓ + ▸ Abort marks pipeline as aborted... ✓ + ▸ Dry run shows config, no artifacts... ✓ + ▸ Self-healing build→test retry loop... ✓ + ▸ Intelligence: Skip stages for documentation issues... ✓ + ▸ Intelligence: Skip stages for low complexity... 
✓ + ▸ Intelligence: Finding classification and routing... ✓ + ▸ Intelligence: Mid-pipeline complexity reassessment... ✓ + ▸ Intelligence: Backtracking limit (1 per pipeline)... ✓ + ▸ Cleanup: Post-completion clears checkpoints and transient artifacts... ✓ + ▸ Cleanup: pipeline_cancel_check_runs function exists... ✓ + ▸ Vitals: sw-pipeline-vitals.sh exists and is syntactically valid... ✓ + ▸ Vitals: All vitals functions defined in module... ✓ + ▸ Vitals: Health verdict maps scores correctly... ✓ + ▸ Vitals: Adaptive limit returns valid integer... ✓ + ▸ Vitals: Budget trajectory returns ok/warn/stop... ✓ + ▸ Quality: pipeline_select_audits function exists... ✓ + ▸ Quality: pipeline_security_source_scan function exists... ✓ + ▸ Quality: pipeline_verify_dod function exists... ✓ + ▸ Quality: pipeline_record_quality_score function exists... ✓ + ▸ Quality: Templates have compound_quality_blocking... ✓ + ▸ Vitals: Progress snapshot writes correct file... ✓ + ▸ Vitals: Momentum score from snapshot history... ✓ + ▸ Vitals: Convergence with decreasing errors... ✓ + ▸ Vitals: Configurable weights via env vars... ✓ + ▸ Vitals: Budget trajectory warn/stop on exhaustion... ✓ + ▸ Quality: Structured findings JSON is valid... ✓ + ▸ Quality: Multi-backtrack counter tracking... ✓ + ▸ Quality: 6 categories in classify_quality_findings... ✓ + ▸ Deploy: Pre-deploy gates exist in pipeline... ✓ + ▸ Deploy: Deploy strategy config pattern... ✓ + ▸ Deploy: Canary deploy flow patterns exist... ✓ + ▸ Pipeline: PIPELINE_STATE references removed... ✓ + ▸ Pipeline: Coverage JSON creation in test stage... ✓ + ▸ Pipeline: _pipeline_compact_goal returns goal+plan+design... ✓ + ▸ Pipeline: load_composed_pipeline sets COMPOSED_STAGES... ✓ + ▸ Vitals: Momentum returns 60 for single snapshot past intake... ✓ + ▸ Vitals: Health gate blocks when health < threshold... ✓ + ▸ Vitals: Health gate passes with default threshold=40... ✓ + ▸ Durable: persist_artifacts function exists... 
✓ + ▸ Durable: persist_artifacts skips in non-CI mode... ✓ + ▸ Durable: verify_stage_artifacts passes when artifacts present... ✓ + ▸ Durable: verify_stage_artifacts fails when artifacts missing... ✓ + ▸ Durable: verify_stage_artifacts fails when artifacts empty... ✓ + ▸ Durable: verify_stage_artifacts passes for stages with no requirements... ✓ + ▸ Durable: verify_stage_artifacts design requires plan.md... ✓ + ▸ Durable: mark_stage_complete wires persist for plan stage... ✓ + +━━━ Results ━━━ + Passed: 58 + Failed: 0 + Total: 58 + +All 58 tests passed! + + + + Shipwright Pipeline Vitals Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help / version + ✓ --help exits 0 + ✓ --help shows USAGE + ✓ --help shows OPTIONS + ✓ --help mentions --json + ✓ --help mentions --score + ✓ VERSION variable defined + + json output + ✓ --json exits 0 + ✓ --json outputs valid JSON + + score output + ✓ --score outputs valid number (79) + + script safety + ✓ Uses set -euo pipefail + ✓ ERR trap is set + + internals + ✓ WEIGHT_MOMENTUM defined + ✓ WEIGHT_CONVERGENCE defined + + _compute_momentum + ✓ _compute_momentum returns numeric score 0-100 (got: 60) + + _compute_convergence + ✓ _compute_convergence returns numeric score (got: 100) + + pipeline_health_verdict + ✓ pipeline_health_verdict(80) returns valid verdict: continue + ✓ pipeline_health_verdict(55) returns valid verdict: warn + ✓ pipeline_health_verdict(35) returns valid verdict: intervene + ✓ pipeline_health_verdict(15) returns valid verdict: abort + ✓ WEIGHT_MOMENTUM is numeric (35) + ✓ WEIGHT_CONVERGENCE is numeric (30) + ✓ WEIGHT_BUDGET is numeric (20) + ✓ WEIGHT_ERROR_MATURITY is numeric (15) + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright PM Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows subcommands + ✓ 
--help exits 0 + + analyze subcommand + ✓ analyze_issue returns JSON with issue field + ✓ analyze without issue exits 1 + + team subcommand + ✓ recommend_team returns JSON with roles + + orchestrate subcommand + ✓ orchestrate_stages returns JSON with stages + + recommend combined + ✓ recommend pipeline produces valid team + stages + + history subcommand + ✓ history exits 0 + ✓ history --json exits 0 + ✓ history --json outputs valid JSON + ✓ history --pattern exits 0 + + learn subcommand + ✓ learn exits 0 + ✓ learn confirms recording + ✓ learn without args exits 1 + ✓ learn invalid outcome exits 1 + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + + state file creation + ✓ pm-history.json created + + + + ────────────────────────────────────────── + + All 20 tests passed + + + + Shipwright PR Lifecycle Test Suite + ══════════════════════════════════════════ + + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ VERSION variable defined + ✓ VERSION is semver: 3.2.4 + + Source Guard + ✓ source guard pattern present + ✓ source guard uses if/then/fi (not && pattern) + + Help Output + ✓ help mentions review command + ✓ help mentions merge command + ✓ help mentions cleanup command + ✓ help mentions status command + ✓ help mentions patrol command + ✓ --help works + + Error Handling + +shipwright pr  + +COMMANDS + review  Run review pass on a PR (checks code quality, posts findings) + merge  Attempt auto-merge (checks CI, conflicts, reviews, then merges) + cleanup Close stale PRs (older than configured days, default 14) + status Show all open Shipwright PRs with lifecycle state + patrol Run full PR lifecycle patrol (review + merge + cleanup) + help Show this help + +EXAMPLES + shipwright pr review 42 # Review PR #42 + shipwright pr merge 42 # Try to merge PR #42 + shipwright pr cleanup # Close stale PRs + shipwright pr status # Show all open PRs + shipwright pr patrol # Full lifecycle management + + ✓ unknown command exits 
non-zero + ✓ unknown command shows error + + Default Behavior + ✓ no-arg defaults to help + + Argument Validation + +shipwright pr  + +COMMANDS + review  Run review pass on a PR (checks code quality, posts findings) + merge  Attempt auto-merge (checks CI, conflicts, reviews, then merges) + cleanup Close stale PRs (older than configured days, default 14) + status Show all open Shipwright PRs with lifecycle state + patrol Run full PR lifecycle patrol (review + merge + cleanup) + help Show this help + +EXAMPLES + shipwright pr review 42 # Review PR #42 + shipwright pr merge 42 # Try to merge PR #42 + shipwright pr cleanup # Close stale PRs + shipwright pr status # Show all open PRs + shipwright pr patrol # Full lifecycle management + + ✓ review without PR number exits non-zero + ✓ review without arg shows error + +shipwright pr  + +COMMANDS + review  Run review pass on a PR (checks code quality, posts findings) + merge  Attempt auto-merge (checks CI, conflicts, reviews, then merges) + cleanup Close stale PRs (older than configured days, default 14) + status Show all open Shipwright PRs with lifecycle state + patrol Run full PR lifecycle patrol (review + merge + cleanup) + help Show this help + +EXAMPLES + shipwright pr review 42 # Review PR #42 + shipwright pr merge 42 # Try to merge PR #42 + shipwright pr cleanup # Close stale PRs + shipwright pr status # Show all open PRs + shipwright pr patrol # Full lifecycle management + + ✓ merge without PR number exits non-zero + ✓ merge without arg shows error + + Cleanup Subcommand + ✓ cleanup runs + + Status Subcommand + ✓ status shows PR info + + Configuration + ✓ get_pr_config helper defined + ✓ auto_merge_enabled config check present + ✓ stale_days config present + + Event Emission + ✓ emits pr lifecycle events + ✓ emits pr.review_complete event + ✓ emits pr.merged event + + Code Quality Checks + ✓ review checks for HACK/TODO/FIXME patterns + ✓ review checks for console.log statements + + + + + 
────────────────────────────────────────── + + All 0 tests passed + + +╔═══════════════════════════════════════════════════════════╗ +║ shipwright predictive test ║ +╚═══════════════════════════════════════════════════════════╝ + + +Risk Assessment + ▸ Risk returns valid schema with 0-100 range... ✓ + ▸ Risk elevates for complex keywords... ✓ + ▸ Risk emits event... ✓ + +Anomaly Detection + ▸ Critical at 3x threshold (181 vs 60 baseline)... ✓ + ▸ Normal at 2x (119 vs 60 baseline)... ✓ + ▸ Warning between 2x and 3x (150 vs 60)... ✓ + ▸ Normal when no baseline exists... ✓ + ▸ Emits event for critical anomaly... ✓ + +Baseline Management + ▸ First data point uses raw value... ✓ + ▸ EMA calculation (0.9*100 + 0.1*200 = 110)... ✓ + ▸ Creates baseline file if missing... ✓ + +Preventative Injection + ▸ Injects context from matching patterns... ✓ + ▸ Returns empty for non-matching stage... ✓ + +Graceful Degradation + ▸ Works without intelligence engine... ✓ + +AI Patrol + ▸ AI patrol returns structured findings... ✓ + +════════════════════════════════════════════════════ + All 15 tests passed ✓ +════════════════════════════════════════════════════ + + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright prep test — Validation Suite (Real Subprocess) ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up test environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-prep-test.D3gFot + + ▸ Node.js project detection... ✓ + ▸ Python project detection... ✓ + ▸ Go project detection... ✓ + ▸ Rust project detection... ✓ + ▸ settings.json is valid JSON... ✓ + ▸ Hook scripts are executable... ✓ + ▸ Hook scripts have valid syntax... ✓ + ▸ CLAUDE.md has required sections... ✓ + ▸ Check mode outputs scoring... ✓ + ▸ Idempotency without --force... ✓ + ▸ --force overwrites modified files... ✓ + ▸ No eval in generated hooks... ✓ + ▸ Definition of Done generated... 
✓ + +━━━ Results ━━━ + Passed: 13 + Failed: 0 + Total: 13 + +All 13 tests passed! + + + + Shipwright PS Test Suite + ══════════════════════════════════════════ + + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present (via bootstrap or inline) + ✓ VERSION variable defined + ✓ VERSION is semver: 3.2.4 + + Function Definitions + ✓ format_idle() defined + ✓ get_status() defined + ✓ status_display() defined + + No tmux Sessions + ✓ shows no-agents message + ✓ suggests starting session + + With Agent Sessions + ✓ shows team window name + ✓ shows AGENT header + ✓ shows PID header + ✓ shows STATUS header + ✓ shows summary counts + + Window Filtering + ✓ shows claude- windows + ✓ filters out non-claude windows + + Status Classification + ✓ recognizes claude/node/npm/npx as active processes + ✓ recognizes shell processes as idle + ✓ uses 300-second idle threshold + ✓ handles dead pane status + + Idle Time Formatting + ✓ format_idle handles hours + ✓ format_idle handles seconds + + Header Display + ✓ shows Process Status header + ✓ shows timestamp + + Pane ID Usage + ✓ uses pane_id (not pane_index) + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Public Dashboard Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage text + ✓ help shows commands + ✓ unknown command shows error + ✓ config show outputs privacy + ✓ config privacy set succeeds + ✓ config domain set succeeds + ✓ config expiry set succeeds + ✓ config unknown key errors + ✓ list shows active links header + ✓ cleanup handles empty + ✓ export creates HTML file + ✓ exported HTML contains title + ✓ embed without token errors + ✓ embed iframe generates iframe tag + ✓ embed markdown generates markdown + ✓ embed link outputs URL + ✓ revoke without token errors + + + + ────────────────────────────────────────── + + All 17 tests passed + + + + shipwright quality test suite + 
══════════════════════════════════════════ + + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ Source guard uses if/then/fi pattern + + Version + ✓ VERSION variable defined at top + ✓ version subcommand outputs version + + Help + ✓ help contains USAGE + ✓ help contains validate subcommand + ✓ help contains audit subcommand + ✓ help contains completion subcommand + ✓ help contains score subcommand + ✓ help contains gate subcommand + ✓ help contains report subcommand + ✓ --help flag works + + Error Handling +\033[38;2;0;212;255m\033[1mshipwright quality\033[0m — Ruthless Quality Validation Engine + +\033[1mUSAGE\033[0m + shipwright quality [options] + +\033[1mSUBCOMMANDS\033[0m + \033[38;2;0;212;255mvalidate\033[0m Multi-layer quality validation + - Test pass rate (must be 100%) + - Coverage threshold + - Uncommitted changes + - TODOs/FIXMEs in diff + - Hardcoded secrets + Output: JSON with scores + + \033[38;2;0;212;255maudit\033[0m Adversarial audit passes + - Security audit (injection, XSS, auth bypass, secrets) + - Correctness audit (logic errors, off-by-one, race conditions) + - Architecture audit (pattern violations, coupling) + Output: JSON with findings per category + + \033[38;2;0;212;255mcompletion\033[0m Intelligent build completion detection + - Analyze diminishing returns (< 10 lines in last 3 iterations) + - Check if tests went from failing to passing + - Check if goal subtasks are complete + Output: JSON with recommendation (continue|complete|escalate) + + \033[38;2;0;212;255mscore\033[0m Calculate comprehensive quality score + - Weighted: test_pass (30%), coverage (20%), security (20%), + architecture (15%), correctness (15%) + - Gate: score must exceed threshold (default 70) + Output: JSON with component scores and overall + + \033[38;2;0;212;255mgate\033[0m Pipeline quality gate + - Runs validate + score + - Exit code 0 if passes, 1 if fails + - Used by pipeline to gate progression + + \033[38;2;0;212;255mreport\033[0m 
Generate markdown quality report + - All checks, scores, audit findings + - Suitable for PR comment or documentation + Output: Markdown file + stdout + + \033[38;2;0;212;255mhelp\033[0m Show this help message + +\033[1mOPTIONS\033[0m + --artifacts-dir PATH Pipeline artifacts directory (default: ./.claude/pipeline-artifacts) + --coverage-threshold N Coverage threshold percentage (default: 70) + --quality-threshold N Overall quality score threshold (default: 70) + +\033[1mEXAMPLES\033[0m + shipwright quality validate + shipwright quality audit + shipwright quality completion + shipwright quality score --quality-threshold 75 + shipwright quality gate + shipwright quality report + shipwright quality gate && echo "Ready to deploy" + +\033[1mEXIT CODES\033[0m + 0 Quality checks passed + 1 Quality checks failed + + ✓ unknown command exits non-zero + ✓ unknown command shows error + + Validate Subcommand + ✓ validate outputs JSON with checks + ✓ validate outputs score + + Audit Subcommand + ✓ audit mentions security audit + ✓ audit mentions correctness audit + ✓ audit mentions architecture audit + + Completion Subcommand + ✓ completion outputs recommendation + ✓ completion outputs reasoning + + Score Subcommand + ✓ score outputs components + ✓ score outputs overall_score + + Events Logging + ✓ events.jsonl created after validate + ✓ events contain quality.validate + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + shipwright reaper test suite + ══════════════════════════════════════════ + + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present (via bootstrap or inline) + + Version + ✓ VERSION variable defined at top + + Help + ✓ help contains USAGE + ✓ help mentions --watch + ✓ help mentions --dry-run + ✓ help mentions --verbose + ✓ help mentions --interval + ✓ help mentions --grace-period + ✓ help mentions --log-file + ✓ help mentions DETECTION ALGORITHM + ✓ -h flag works + + Help Exit Code + ✓ --help exits 0 + + Error Handling 
+ +shipwright reaper — Automatic pane cleanup when agents exit + +USAGE + shipwright reaper # One-shot: scan, reap, exit + shipwright reaper --watch # Continuous loop (5s interval) + shipwright reaper --dry-run # Preview what would be reaped + shipwright reaper --dry-run --verbose # Show all panes and their status + +OPTIONS + --watch Run continuously instead of one-shot + --dry-run Show what would be reaped without doing it + --verbose Show details for every pane scanned + --interval Seconds between watch scans (default: 5) + --grace-period Idle seconds before reaping (default: 15) + --log-file Append reaper activity to a log file + --help, -h Show this help + +DETECTION ALGORITHM + 1. pane_dead == 1 → REAP (zombie pane) + 2. command ∉ (bash,zsh,fish,sh) → SKIP (agent still running) + 3. pane_title is empty → SKIP (not initialized) + 4. pane_idle < grace_period → SKIP (may be starting) + 5. All checks passed → REAP (agent exited) + +EXAMPLES + shipwright reaper --watch --interval 10 --grace-period 30 + shipwright reaper --watch --log-file ~/.sw-reaper.log & + ✓ unknown option exits non-zero + ✓ unknown option shows error + + One-shot Mode + ✓ one-shot shows Agent Panes header + ✓ one-shot shows Empty Windows header + ✓ one-shot shows Team Directories header + ✓ one-shot shows healthy message + + Dry-run Mode + ✓ dry-run shows scan output + + Log File Option + ✓ --log-file without value exits non-zero + + PID File + ✓ PID_FILE variable defined + ✓ PID file uses .sw-reaper.pid + + + + + ────────────────────────────────────────── + + All 0 tests passed + + +═══ Section 1: Core Commands ═══ + +Testing roles command... PASS +ERROR: scripts/sw-recruit-test.sh:58 exited with status 1 +Testing roles count... PASS +Testing match command for build task... PASS +Testing match command for architecture task... PASS +Testing match command for security task... PASS +Testing match shows confidence and method... PASS +Testing team command... 
WARN: Unknown event type 'recruit_team' — update config/event-schema.json +PASS +Testing team composition for security issue... WARN: Unknown event type 'recruit_team' — update config/event-schema.json +PASS +Testing profiles command with no data... PASS +Testing stats command... PASS +Testing evaluate command with missing data... PASS +Testing onboard command for architect... WARN: Unknown event type 'recruit_onboarding' — update config/event-schema.json +PASS +Testing onboard command for builder... WARN: Unknown event type 'recruit_onboarding' — update config/event-schema.json +PASS +Testing help command... PASS +Testing team cost calculation... WARN: Unknown event type 'recruit_team' — update config/event-schema.json +PASS +Testing database initialization... PASS +Testing roles database content... PASS + +═══ Section 2: Feedback Loop ═══ + +Testing record-outcome creates profile... PASS +Testing record-outcome updates existing profile... PASS +Testing record-outcome calculates success rate... PASS +Testing profiles shows recorded agents... PASS +Testing evaluate uses recorded profile... PASS +Testing promote evaluates recorded agent... PASS + +═══ Section 3: Dynamic Roles ═══ + +Testing create-role manual... PASS +Testing roles count after create-role... PASS +Testing custom role has manual origin... PASS +Testing custom role in roles listing... PASS +Testing stats reflects custom roles... PASS + +═══ Section 4: Learning Infrastructure ═══ + +Testing match records history... PASS +Testing match history has correct fields... PASS +Testing role usage tracking... PASS +Testing heuristics DB initialization... PASS +Testing meta-learning DB initialization... PASS + +═══ Section 5: Population-Aware Evaluation ═══ + +Testing stats shows population stats... PASS +Testing evaluate mentions population thresholds... PASS +Testing specializations command... PASS + +═══ Section 6: AGI-Level Features ═══ + +Testing evolve command... 
WARN: Unknown event type 'recruit_evolve' — update config/event-schema.json +PASS +Testing reflect command... WARN: Unknown event type 'recruit_reflect' — update config/event-schema.json +PASS +Testing mind command without agent... PASS +Testing mind command with agent... WARN: Unknown event type 'recruit_mind' — update config/event-schema.json +PASS +Testing agent-minds DB created... PASS +Testing decompose command (fallback)... PASS +Testing self-tune command... WARN: Unknown event type 'recruit_self_tune' — update config/event-schema.json +PASS +Testing route command... PASS +Testing invent command... PASS + +═══ Section 7: Extended Matching ═══ + +Testing match optimizer for speed task... PASS +Testing match devops for docker task... PASS +Testing match incident-responder for outage... PASS + +═══ Section 8: Database Integrity ═══ + +Testing roles.json is valid JSON... PASS +Testing profiles.json is valid JSON... PASS +Testing heuristics.json is valid JSON... PASS +Testing meta-learning.json is valid JSON... PASS +Testing role-usage.json is valid JSON... PASS +Testing agent-minds.json is valid JSON... PASS +Testing builtin roles have origin field... PASS +Testing version is 3.0.0... PASS + +═══ Section 9: Adaptive Onboarding ═══ + +Testing onboard shows onboarding style... WARN: Unknown event type 'recruit_onboarding' — update config/event-schema.json +PASS +Testing onboard unknown role fails... PASS + +═══ Section 10: Help Coverage ═══ + +Testing help lists create-role command... PASS +Testing help lists record-outcome command... PASS +Testing help lists evolve command... PASS +Testing help lists reflect command... PASS +Testing help lists invent command... PASS +Testing help lists mind command... PASS +Testing help lists decompose command... PASS +Testing help lists self-tune command... PASS +Testing help lists route command... PASS +Testing help lists ingest-pipeline command... 
PASS + +═══ Section 11: Ingest Pipeline ═══ + +Testing ingest-pipeline with synthetic events... PASS +Testing ingest-pipeline creates profiles from events... PASS + +═══ Section 12: Error Paths ═══ + +Testing match with no args fails... PASS +Testing evaluate with no args fails... PASS +Testing record-outcome with missing args fails... PASS +Testing decompose with no args fails... PASS +Testing unknown command fails... PASS + +═══ Section 13: Route with Agent History ═══ + +Testing route finds best experienced agent... PASS + +═══ Section 14: Auto-Role Fallback ═══ + +Testing create-role --auto generates slug key (no Claude)... PASS +Testing create-role --auto fallback creates slugified key... PASS + +═══ Section 15: JSON Output Mode ═══ + +Testing match --json returns valid JSON... PASS +Testing match --json includes model field... PASS +Testing match --json includes confidence... PASS +Testing team --json returns valid JSON... WARN: Unknown event type 'recruit_team' — update config/event-schema.json +PASS +Testing team --json includes agents count... PASS +Testing team --json includes model... PASS +Testing team --json team array has members... PASS + +═══ Section 16: Infrastructure ═══ + +Testing jq dependency check exists in script... PASS +Testing flock-based locking helper exists... PASS +Testing agent_id field in match history records... PASS + +═══ Section 17: E2E Integration Flow ═══ + +Testing e2e: match provides model for pipeline... PASS +Testing e2e: team provides agent count for pipeline... WARN: Unknown event type 'recruit_team' — update config/event-schema.json +PASS +Testing e2e: pipeline event with agent_id ingests correctly... PASS +Testing e2e: route uses ingested profile for smart routing... PASS +Testing e2e: self-tune captures security keyword... PASS +Testing e2e: events trail shows recruit activity... PASS + +═══ Section 18: Integration Validation ═══ + +Testing sw-pipeline.sh has agent_id in pipeline.completed events... 
PASS +Testing sw-pipeline.sh has recruit model selection... PASS +Testing sw-pipeline.sh has auto-ingest after completion... PASS +Testing sw-pm.sh has recruit team integration... PASS +Testing sw-triage.sh has recruit team integration... PASS +Testing sw-loop.sh has recruit role assignment... PASS +Testing sw-loop.sh pulls role descriptions from recruit DB... PASS +Testing sw-swarm.sh has recruit-powered type selection... PASS + + --- Runtime Integration Contract Tests --- + +Testing CONTRACT: pipeline can parse recruit match --json for model... PASS +Testing CONTRACT: pipeline can parse recruit match --json for role... PASS +Testing CONTRACT: pm can parse recruit team --json for team/agents/model... PASS +Testing CONTRACT: triage can parse recruit team --json for template/max_iterations... PASS +Testing CONTRACT: swarm can map recruit match --json model to agent type... PASS +Testing CONTRACT: loop can parse recruit team --json for AGENT_ROLES... PASS +Testing CONTRACT: autonomous can safely build args from match --json... PASS +Testing CONTRACT: match records include match_id for outcome linkage... PASS +Testing CONTRACT: record-outcome backfills match outcome... PASS + +═══ Section 19: Policy Integration ═══ + +Testing recruit reads policy.json recruit section... PASS +Testing policy values are loaded into recruit variables... PASS +Testing policy self_tune_min_matches is configurable... PASS + +═══ Section 20: Meta Feedback Loop ═══ + +Testing meta-validation function exists... PASS +Testing reflect calls meta-validation... PASS +Testing meta-loop can detect declining accuracy... PASS +Testing auto self-tune triggers after ingest-pipeline... PASS +Testing auto evolve triggers after sufficient outcomes... PASS + +═══ Section 21: Negative-Compounding Audit ═══ + +Testing audit command exists... PASS +Testing audit runs and produces score... PASS +Testing audit emits recruit_audit event... PASS +Testing audit tracks score trend in meta-learning... 
PASS +Testing audit checks data stores... PASS +Testing audit checks feedback loops... PASS +Testing audit checks integration wiring... PASS +Testing audit checks policy governance... PASS +Testing audit checks automation triggers... PASS + +════════════════════════════════════════════════════════════════ +Results: 128 PASS | 0 FAIL +════════════════════════════════════════════════════════════════ + + + shipwright regression test suite + ══════════════════════════════════════════ + + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + ✓ Source guard uses if/then/fi pattern + + Version + ✓ VERSION variable defined at top + + Help + ✓ help contains USAGE + ✓ help contains baseline + ✓ help contains check + ✓ help contains report + ✓ help contains history + ✓ help contains METRICS TRACKED + ✓ help contains EXIT CODES + ✓ --help flag works + + Error Handling + +\033[38;2;0;212;255m\033[1mshipwright regression\033[0m — Detect regressions after merge + +\033[1mUSAGE\033[0m + shipwright regression [options] + +\033[1mCOMMANDS\033[0m + baseline [--save] Capture current metrics as baseline + check Compare current state against saved baseline (exit 1 if regressions) + report [--json|--md] Generate detailed regression report + history Show baseline history (last 10) + help Show this help + +\033[1mMETRICS TRACKED\033[0m + • Test count (must not decrease) + • Test suite pass rate (must not drop >5% by default) + • Total script line count (must not increase >20% by default) + • Script count (must not decrease) + • Function count (must not decrease) + • Bash syntax errors (must not increase) + +\033[1mBASELINE STORAGE\033[0m + Baselines stored in: ~/.shipwright/baselines/ + Latest symlink: ~/.shipwright/baselines/latest.json + Thresholds: ~/.shipwright/regression-thresholds.json + +\033[1mEXAMPLES\033[0m + \033[2m# Capture baseline after successful merge\033[0m + shipwright regression baseline --save + + \033[2m# Check for regressions before deploying\033[0m + 
shipwright regression check + + \033[2m# Generate a detailed report\033[0m + shipwright regression report --markdown + + \033[2m# View historical baselines\033[0m + shipwright regression history + +\033[1mEXIT CODES\033[0m + 0 No regressions detected + 1 Regressions found or error + + ✓ unknown command exits non-zero + ✓ unknown command shows error + + Baseline Subcommand + ✓ baseline shows metrics + ✓ baseline shows Test Count + ✓ baseline shows Pass Rate + ✓ baseline shows saved message + ✓ baseline JSON file created + ✓ latest.json symlink created + + State Files + ✓ regression-thresholds.json created + ✓ thresholds contain pass_rate_drop + ✓ thresholds contain test_count_decrease + + Check Subcommand + ✓ check without baseline exits non-zero + ✓ check without baseline shows error + + History Subcommand + ✓ history with no baselines shows warning + + Events Logging + ✓ events.jsonl created after baseline + ✓ events contain regression.baseline + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Release Manager Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help / version + ✓ help exits 0 + ✓ help shows USAGE + ✓ help mentions check + ✓ help mentions prepare + ✓ help mentions publish + ✓ help mentions rollback + ✓ VERSION variable defined + + error handling + ✓ Unknown command exits non-zero + + history command + ✓ history exits 0 + + stats command + ✓ stats exits 0 + + script safety + ✓ Uses set -euo pipefail + ✓ Has source guard pattern + + state management + ✓ Release state directory exists + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Release Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage + ✓ help shows commands + ✓ help shows prepare + ✓ help shows changelog + ✓ help shows tag + ✓ help shows publish + + Version Parsing + ✓ 
parse_version v1.2.3 + ✓ parse_version v10.20.30 + + Version Bumping + ✓ bump patch v1.2.3 -> v1.2.4 + ✓ bump minor v1.2.3 -> v1.3.0 + ✓ bump major v1.2.3 -> v2.0.0 + + Version Comparison + ✓ compare v1.2.3 == v1.2.3 + ✓ compare v1.2.3 < v1.3.0 + ✓ compare v2.0.0 > v1.9.9 + + Commit Type Extraction + ✓ get_commit_type feat + ✓ get_commit_type fix + + Status Command + ✓ status shows version + ✓ status shows header + + Tag Command + ✓ tag dry-run shows version + ✓ tag dry-run shows DRY RUN + ✓ tag invalid format exits non-zero + ✓ tag invalid format shows error + + Changelog Command + ✓ changelog shows from tag + ✓ changelog shows generated msg + + Error Handling + ✓ unknown command exits non-zero + ✓ unknown command shows error + + + + ────────────────────────────────────────── + + All 26 tests passed + + + Root Cause Classifier Tests + ══════════════════════════════════════════ + + + Classification + ✓ classify rate_limit + ✓ rate_limit confidence >80 + ✓ classify context_exhaustion + ✓ classify infra_issue (timeout) + ✓ classify infra_issue (OOM) + ✓ classify platform_bug + ✓ classify config_error + ✓ classify external_dep + ✓ classify code_bug (assertion) + ✓ classify code_bug (syntax) + ✓ classification returns valid JSON + + Error Log Analysis + ✓ handle missing error log + ✓ analyze error log entry count + + Fix Suggestions + ✓ rate_limit suggestion has backoff + ✓ context_exhaustion suggestion has max-restarts + ✓ infra_issue suggestion has disk check + ✓ platform_bug suggestion has doctor + ✓ config_error suggestion has daemon-config + ✓ suggestion returns valid JSON + + Learning System +WARN: Unknown event type 'rootcause.learned' — update config/event-schema.json + ✓ learning writes file + ✓ learning produces valid JSONL + ✓ learning accumulates entries + + Platform Issue Creation + ✓ skip issue creation with NO_GITHUB + ✓ skip low confidence issues + ✓ skip non-platform categories + + Report Generation + ✓ report handles missing history + ✓ report shows 
category distribution + ✓ report shows total count + + Integration + ✓ main workflow returns valid JSON + ✓ main workflow produces classification + ✓ main saves to learning file + + ────────────────────────────────────────── + + All 31 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright remote — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Machine Registry + ▸ Add machine creates machines.json... ▸ Checking shipwright installation at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-remote-test.6WKNxJ/mock-install... +✓ Shipwright found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-remote-test.6WKNxJ/mock-install +✓ Registered machine: builder-1 (localhost, worker, 4 workers) +✓ + ▸ Add second machine appends to array... ▸ Checking shipwright installation at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-remote-test.6WKNxJ/mock-install... +✓ Shipwright found at /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-remote-test.6WKNxJ/mock-install +✓ Registered machine: builder-2 (localhost, worker, 8 workers) +✓ + ▸ Remove machine by name... ✓ Removed machine: builder-2 +✓ + ▸ List machines returns output... ✓ + ▸ machines.json uses atomic writes... ✓ + ▸ Duplicate machine name prevented... ▸ Use shipwright remote remove builder-1 first +✓ + ▸ Remote script has help command... ✓ + +CLI & Dashboard Integration + ▸ CLI router includes remote command... ✓ + ▸ CLI router includes heartbeat command... ✓ + ▸ CLI router includes checkpoint command... ✓ + ▸ Doctor has heartbeat health check... ✓ + ▸ Doctor has remote machine checks... ✓ + ▸ Status shows heartbeat section... ✓ + ▸ Status shows remote machines section... 
✓ + +════════════════════════════════════════════════════ + All 14 tests passed ✓ +════════════════════════════════════════════════════ + + + + Shipwright Replay Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage + ✓ help shows list + ✓ help shows show + ✓ help shows narrative + ✓ help shows diff + ✓ help shows export + ✓ help shows compare + + List Command + ✓ list with no events exits 0 + ✓ list with no events warns + + Show Command + ✓ show without issue exits non-zero + ✓ show shows usage + + Narrative Command + ✓ narrative without issue exits non-zero + ✓ narrative shows usage + + Diff Command + ✓ diff without issue exits non-zero + ✓ diff shows usage + + Export Command + ✓ export without issue exits non-zero + ✓ export shows usage + + Compare Command + ✓ compare without args exits non-zero + ✓ compare shows usage + + Missing Data + ✓ show non-existent issue exits non-zero + ✓ show non-existent issue says not found + + List With Events + ✓ list shows pipeline runs header + + Error Handling + ✓ unknown command exits non-zero + ✓ unknown command shows error + + Pipeline Events Fixture (Issue #42) + ✓ list shows issue 42 + ✓ show 42 has stage information + ✓ show 42 has plan stage + ✓ show 42 has build stage + ✓ show 42 has Pipeline Type + ✓ narrative 42 produces prose + ✓ narrative 42 has stages count + ✓ export 42 produces report + ✓ export 42 has JSON-structured events + ✓ export 42 has stage table + + Diff/Compare Error Handling + ✓ diff without issue exits non-zero + ✓ diff without issue shows usage + ✓ compare with missing second issue exits non-zero + ✓ compare missing arg shows usage + + + + ────────────────────────────────────────── + + All 38 tests passed + + + + Shipwright Retro Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ unknown command exits 
1 + ✓ summary shows Sprint Summary + ✓ summary outputs valid JSON + ✓ history handles no retros + ✓ actions shows improvements + ✓ VERSION is defined + ✓ format_duration handles hours + ✓ format_duration handles minutes + ✓ format_duration handles seconds + ✓ actions generated for low quality + ✓ actions generated for high retries + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Scale Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage + ✓ help shows up + ✓ help shows down + ✓ help shows rules + ✓ help shows status + ✓ help shows history + + Rules Command + ✓ rules show has iteration_threshold + ✓ rules show has max_team_size + ✓ rules creates default file + ✓ rules set confirms update + ✓ rules set persists value + ✓ rules reset confirms + ✓ rules reset restores default + ✓ rules set without args exits non-zero + + Up Command + ✓ up records scale event + ✓ up creates scale events file + ✓ up with invalid role exits non-zero + ✓ up invalid role shows error + + Down Command + ✓ down without agent-id exits non-zero + ✓ down shows usage + ✓ down records scale event + + Status Command + ✓ status shows header + ✓ status shows team size + ✓ status shows max team size + + History Command + ✓ history shows header + ✓ history with no events warns + + Recommend Command + ✓ recommend shows header + ✓ recommend shows thresholds + + Error Handling + ✓ unknown command exits non-zero + ✓ unknown command shows error + + + + ────────────────────────────────────────── + + All 30 tests passed + + + Shipwright Stall Detector Test v3.2.4 + ════════════════════════════════════════════ + + ▸ classify: heartbeat stale... ✓ + ▸ classify: loop detected... ✓ + ▸ config: show defaults... ✓ + ▸ check: empty heartbeats... ✓ + ▸ check: fresh heartbeat... ✓ + ▸ check: stale heartbeat... ✓ + ▸ check: JSON output format... ✓ + ▸ check: multiple stalled pipelines... 
✓ + ▸ abort: no heartbeat error... ✓ + ▸ abort: with heartbeat... ⚠ Pipeline process 99999 not running (already dead) +✓ + ▸ status: no detector running... ✓ + ▸ help: display help... ✓ + +════════════════════════════════════════════ + Results + ✓ Passed: 12 + ✗ Failed: 0 + Total: 12 + + + + Shipwright Security Audit Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ unknown command exits 1 + ✓ secrets scan completes on clean repo + ✓ license scan completes + ✓ SBOM file created + ✓ SBOM is valid JSON + ✓ permissions audit completes + ✓ compliance report file created + ✓ VERSION is defined + + + + ────────────────────────────────────────── + + All 0 tests passed + + +━━━ shipwright self-optimize tests ━━━ + + ▸ Outcome analysis extracts correct metrics... ✓ + ▸ Outcome analysis emits event... ✓ + ▸ Outcome analysis rejects missing file... ✓ + ▸ Template weight increases for high success... ✓ + ▸ Template weight decreases for low success... ✓ + ▸ A/B test selects ~20% sample... ✓ + ▸ Iteration model updates with data points... ✓ + ▸ Model routing tracks success rates... ✓ + ▸ Model routing keeps opus with few sonnet samples... ✓ + ▸ Memory pruning removes old patterns... ✓ + ▸ Memory strengthening boosts confirmed patterns... ✓ + ▸ Memory promotion copies cross-repo patterns... ✓ + ▸ Full analysis runs on empty data... ✓ + ▸ Report generates output with data... ✓ + ▸ Report handles empty outcomes... ✓ + ▸ Outcome analysis extracts stage data... ✓ + ▸ Template weights output has .weights wrapper... 
jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +✓ + ▸ Iteration model output has flat format... 
jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +✓ + ▸ Model routing output has .routes wrapper... 
jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to 
--argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +✓ + ▸ Full analysis creates last-report.txt... jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +jq: invalid JSON text passed to --argjson +Use jq --help for help with command-line options, +or see the jq manpage, or online docs at https://jqlang.github.io/jq +✓ + ▸ Context efficiency skips with no events... ✓ + ▸ Context efficiency detects high budget utilization... ✓ + ▸ Context efficiency detects high trim ratio... ✓ + ▸ Context efficiency reports healthy when metrics normal... ✓ + ▸ Full analysis includes context efficiency... ✓ + +━━━ Results ━━━ + Passed: 25 + Failed: 0 + Total: 25 + +All 25 tests passed! + + +╔═══════════════════════════════════════════════════╗ +║ shipwright session — E2E Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... +Test tmux session: sw-test-40677 + +Template Loading + ▸ Load feature-dev template (3 agents)... ✓ + ▸ Load exploration template (2 agents)... ✓ + ▸ Missing template returns error... ✓ + ▸ All 24 templates load successfully... 
✓ + +Window Creation + ▸ No-launch creates window without claude... ✓ + ▸ Duplicate window detection... ✓ + ▸ Window gets dark theme... ✓ + ▸ Auto-generated team name... ✓ + +Prompt & Launcher + ▸ Launcher script generation with template... ✓ + ▸ Output includes agents from template... ✓ + ▸ Output includes goal when provided... ✓ + ▸ No template + goal works... ✓ + ▸ No template + no goal = interactive... ✓ + +CLI & Configuration + ▸ Help flag... ✓ + ▸ Unknown option... ✓ + ▸ TMPDIR used for launcher files... ✓ + +Enhanced Features + ▸ Template auto-suggestion from goal... ✓ + ▸ No suggestion without goal... ✓ + ▸ Memory injection in prompt... ✓ + ▸ CLAUDE.md reminder in prompt... ✓ + ▸ Secure temp dir in source... ✓ + +════════════════════════════════════════════════════ + All 21 tests passed ✓ +════════════════════════════════════════════════════ + + + + shipwright setup test suite + ══════════════════════════════════════════ + + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present + + Version + ✓ VERSION variable defined at top + + Help + ✓ help mentions Usage + ✓ help mentions Phase 1 + ✓ help mentions Phase 2 + ✓ help mentions Phase 3 + ✓ help mentions Phase 4 + ✓ help mentions --skip-daemon-prompt + ✓ -h flag works + + Help Exit Code + ✓ --help exits 0 + + Phase Structure + ✓ PHASE 1: PREREQUISITES defined + ✓ PHASE 2: REPO ANALYSIS defined + ✓ PHASE 3: CONFIGURATION GENERATION defined + ✓ PHASE 4: VALIDATION defined + + Prerequisites Detection + ✓ REQUIRED_TOOLS array defined + ✓ checks for tmux + ✓ checks for bash + ✓ checks for git + ✓ checks for jq + ✓ checks for gh + ✓ checks for claude + + Language Detection + ✓ detects Node.js via package.json + ✓ detects Rust via Cargo.toml + ✓ detects Go via go.mod + ✓ detects Python via pyproject.toml + + Skip Daemon Prompt + ✓ SKIP_DAEMON_PROMPT flag defined + ✓ --skip-daemon-prompt flag handled + + Subprocess Calls + ✓ calls sw-init.sh for config generation + ✓ calls sw-doctor.sh for validation + + 
OS Detection + ✓ detect_os function present + ✓ handles macOS + ✓ handles Linux + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Standup Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help / version + ✓ help exits 0 + ✓ help shows USAGE + ✓ help mentions digest + ✓ help mentions yesterday + ✓ help mentions blockers + ✓ help mentions velocity + ✓ help mentions notify + ✓ VERSION variable defined + + error handling + ✓ Unknown command exits non-zero + + yesterday command + ✓ yesterday exits 0 + + today command + ✓ today exits 0 + + blockers command + ✓ blockers exits 0 + + velocity command + ✓ velocity exits 0 + + history command + ✓ history exits 0 + + script safety + ✓ Uses set -euo pipefail + ✓ Has source guard pattern + + state management + ✓ Standups directory exists + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Status Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + JSON Output + ✓ --json produces valid JSON + ✓ All 11 top-level keys present + ✓ Version is semver format (3.2.4) + ✓ Timestamp is ISO-8601 UTC (2026-03-09T11:35:46Z) + ✓ tmux_windows has Claude windows + ✓ teams has 1 team + ✓ team has 3 members + ✓ task_lists total is 4 + ✓ task_lists completed is 2 + ✓ task_lists in_progress is 1 + ✓ daemon.running is false (mock kill fails) + ✓ daemon has 1 active job + ✓ daemon has 2 queued issues + ✓ daemon has 2 recent completions + ✓ heartbeats has 1 entry + ✓ heartbeat stage is build + ✓ remote_machines has 1 machine + ✓ issue_tracker provider is linear + ✓ connected_developers total_online is 1 + ✓ No ANSI escape codes in JSON + + CLI Flags + ✓ --help mentions --json + + Human-Readable Output + ✓ Human output has 'TMUX WINDOWS' header + ✓ Human output has 'TEAM CONFIGS' header + ✓ Human output has 'TASK LISTS' header + ✓ Human output has 'DAEMON PIPELINES' 
header + + Empty State + ✓ Empty state produces valid JSON + ✓ Empty state daemon is null + ✓ Empty state teams is [] + + Subsection Queries + ✓ daemon.active_jobs[].issue queryable + ✓ daemon.queued queryable + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Strategic Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help + ✗ help shows usage + output missing: Usage + ✗ help shows commands + output missing: Commands + ✗ unknown command shows error + output missing: Unknown command + Sourced Functions + + + Shipwright Stream Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage + ✓ help shows start + ✓ help shows stop + ✓ help shows watch + ✓ help shows list + ✓ help shows replay + + List Command + ✓ list shows no streams msg + + Stop Command + ✓ stop when not running exits non-zero + ✓ stop shows not running msg + + Config Command + ✓ config set confirms update + ✓ config creates config file + ✓ config persists interval value + ✓ config without key exits non-zero + ✓ config without key shows usage + ✓ config unknown key exits non-zero + ✓ config unknown key shows error + + Replay Command + ✓ replay without args exits non-zero + ✓ replay shows usage + ✓ replay missing data exits non-zero + ✓ replay missing data shows error + + Watch Command + ✓ watch without team exits non-zero + ✓ watch shows usage + + List With Data + ✓ list shows active stream + + Error Handling + ✓ unknown command exits non-zero + ✓ unknown command shows error + + + + ────────────────────────────────────────── + + All 25 tests passed + + + + Shipwright Swarm Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ status shows empty swarm + ✓ spawn standard creates agent + ✓ registry.json exists after spawn + 
✓ active_count is 1 after spawn + ✓ config.json exists after operations + ✓ spawn invalid type returns error + ✓ health shows agent status + ✓ top shows leaderboard + ✓ config show displays settings + ✓ config set updates value + ✓ config reset restores defaults + ✓ unknown command exits 1 + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Team Stages Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help + ✓ help shows usage + ✓ help shows commands + ✓ --help flag works + ✓ unknown command shows error + Compose + ✓ compose build outputs JSON with stage + ✓ compose output is valid JSON + ✓ compose stage is build + ✓ compose test includes tester specialist + ✓ compose review includes reviewer specialist + Roles + ✓ roles shows builder + ✓ roles shows reviewer + ✓ roles shows tester + ✓ roles shows security + ✓ roles shows docs + Status + ✓ status with no teams + Delegate + ✓ delegate produces tasks array + ✓ delegate assigns files to tasks + + + + ────────────────────────────────────────── + + All 17 tests passed + + + + shipwright templates test suite + ══════════════════════════════════════════ + + + Script Safety + ✓ set -euo pipefail present + ✓ ERR trap present (via bootstrap or inline) + + Version + ✓ VERSION variable defined at top + + Help + ✓ help contains USAGE + ✓ help contains list subcommand + ✓ help contains show subcommand + ✓ help mentions TEMPLATE LOCATIONS + ✓ help mentions CREATING TEMPLATES + ✓ --help flag works + + Error Handling + + shipwright templates — Browse and inspect team templates + + USAGE + shipwright templates list List available templates + shipwright templates show Show template details + + TEMPLATE LOCATIONS + ~/.shipwright/templates/ Custom templates (takes priority) + /Users/sethford/Documents/shipwright/.worktrees/pipeline-issue-122/tmux/templates/ Built-in templates + + CREATING TEMPLATES + Drop a JSON file in 
~/.shipwright/templates/: + + { + "name": "my-template", + "description": "What this team does", + "agents": [ + {"name": "agent-1", "role": "Does X", "focus": "src/"}, + {"name": "agent-2", "role": "Does Y", "focus": "tests/"} + ], + "layout": "tiled" + } + + ✓ unknown subcommand exits non-zero + ✓ unknown subcommand shows error + + List Subcommand + ✓ list header shows Team Templates + + Custom Templates + ✓ list shows custom template name + ✓ list shows custom template description + + Show Subcommand + ✓ show displays template name + ✓ show displays description + ✓ show displays Agents header + ✓ show displays agent name + + Show Without Name + Usage: shipwright templates show  + ✓ show without name exits non-zero + ✓ show without name shows error + + Show Nonexistent Template + + Available templates: + my-custom + accessibility + api-design + architecture + bug-fix + code-review + compliance + data-pipeline + debt-paydown + devops + doc-fleet + documentation + exploration + feature-dev + full-stack + i18n + incident-response + migration + observability + onboarding + performance + refactor + release + security-audit + spike + testing + + ✓ show nonexistent template exits non-zero + ✓ show nonexistent shows not found + + Subcommand Aliases + ✓ ls alias works (shows Team Templates) + ✓ info alias works (shows template) + + Template Directories + ✓ defines USER_TEMPLATES_DIR + ✓ uses ~/.shipwright/templates + ✓ uses tmux/templates for built-in + + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Testgen Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + ✓ help shows usage text + ✓ help exits 0 + ✓ --help flag works + ✓ unknown command exits 1 + ✓ coverage shows analysis + ✓ coverage JSON is valid + ✓ threshold show outputs value + ✓ threshold set confirms + ✓ quality scoring runs + ✓ quality on missing file exits nonzero + ✓ gaps shows untested functions + ✓ VERSION is 
defined + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright tmux-pipeline Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help + ✓ help shows usage + ✓ help shows commands + ✓ --help flag works + ✓ unknown command shows error + Spawn + ✓ spawn without issue errors + ✓ spawn existing window warns + ✓ spawn new pipeline succeeds + ✓ heartbeat has job_id + ✓ heartbeat has pane_id + List + ✓ list shows pipeline windows + ✓ list shows pipeline-42 + Capture + ✓ capture without issue errors + ✓ capture shows output + Kill + ✓ kill succeeds + ✓ kill removes heartbeat file + Layout + ✓ layout tiled succeeds + ✓ layout horizontal succeeds + ✓ layout bogus errors + Attach + ✓ attach without issue errors + + + + ────────────────────────────────────────── + + All 19 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright tmux — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +tmux Doctor + ▸ doctor runs without error... PASS + ▸ doctor outputs pass/warn/fail counts... PASS + ▸ doctor reports all pass with correct options... PASS + ▸ doctor detects bad escape-time... PASS + ▸ doctor detects bad history-limit... PASS + ▸ doctor detects passthrough off... PASS + ▸ doctor detects tmux version... PASS + ▸ doctor detects terminal emulator... PASS + +tmux Install + ▸ install creates TPM directory... PASS + ▸ install skips if TPM already exists... PASS + ▸ install copies overlay file... PASS + +tmux Fix + ▸ fix applies fixes for bad options... PASS + ▸ fix reports no fixes when all correct... PASS + ▸ fix outside tmux returns error... + Shipwright — tmux Auto-Fix + ══════════════════════════════════════════ + +PASS + +tmux Reload + ▸ reload calls source-file... PASS + ▸ reload outside tmux returns error... PASS + ▸ reload with missing .tmux.conf prints error... 
PASS + +CLI Routing + ▸ CLI routes doctor command... PASS + ▸ Help contains all subcommands... PASS + ▸ Default shows help... PASS + ▸ Unknown command exits with error... PASS + ▸ CLI aliases work (check, setup)... PASS + +Edge Cases + ▸ Doctor outside tmux checks config file... PASS + ▸ Doctor detects missing TPM... PASS + ▸ Doctor warns on old tmux version... PASS + + ────────────────────────────────────────── + 25 passed 0 failed (25 total) + + + + Shipwright Trace Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage + ✓ help shows show + ✓ help shows list + ✓ help shows search + ✓ help shows export + + Show Command + ✓ show without issue exits non-zero + ✓ show without issue shows error + ✓ show displays ISSUE section + ✓ show displays issue title + ✓ show displays PIPELINE section + ✓ show displays PULL REQUEST section + ✓ show displays DEPLOYMENT section + + List Command + ✓ list with no events exits non-zero + ✓ list with no events warns + ✓ list shows header + + Search Command + ✓ search without --commit exits non-zero + ✓ search shows usage + ✓ search --commit without sha exits non-zero + + Export Command + ✓ export without issue exits non-zero + ✓ export without issue shows error + + Error Handling + ✓ unknown command exits non-zero + ✓ unknown command shows error + + + + ────────────────────────────────────────── + + All 22 tests passed + + +╔═══════════════════════════════════════════════════╗ +║ shipwright tracker — Test Suite ║ +╚═══════════════════════════════════════════════════╝ + +Setting up test environment... + +Provider Loading + ▸ Provider loads from config (linear)... ✓ + ▸ Provider loads jira... ✓ + ▸ Graceful skip when no provider... ✓ + +Pipeline Enrichment + ▸ Stage descriptions exist for all 12 stages... ✓ + ▸ Enriched progress body has Delivering line... ✓ + ▸ Enriched progress body has stage descriptions... 
✓ + ▸ Pipeline state includes stage_progress... ✓ + ▸ Pipeline state includes stage description... ✓ + +Integration + ▸ Tracker notify routes to provider (mock)... ✓ + ▸ Dashboard reads goal from pipeline state... ✓ + ▸ Jira config validation... ✓ + ▸ Linear config migration (legacy fallback)... ✓ + +GitHub Adapter + ▸ GitHub adapter file exists... ✓ + ▸ GitHub adapter has provider_discover_issues... ✓ + ▸ GitHub adapter has provider_get_issue... ✓ + ▸ GitHub adapter has provider_comment... ✓ + ▸ GitHub adapter has provider_create_issue... ✓ + ▸ GitHub adapter NO_GITHUB guard works... ✓ + ▸ GitHub adapter normalize jq produces correct schema... ✓ + +════════════════════════════════════════════════════ + All 19 tests passed ✓ +════════════════════════════════════════════════════ + + + + Shipwright Triage Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows subcommands + ✓ --help exits 0 + + error handling + ✓ unknown subcommand exits 1 + ✓ unknown shows error + + GitHub guard + ✓ analyze exits 1 with NO_GITHUB=1 + ✓ analyze shows disabled + ✓ analyze without args exits 1 + ✓ team without args exits 1 + + internal analysis functions + ✓ analyze_type detects security + ✓ analyze_type detects bug + ✓ analyze_type detects feature + ✓ short text = trivial complexity + ✓ security text = high risk + ✓ trivial+low = xs effort + ✓ suggest_labels includes type + ✓ suggest_labels includes risk + ✓ suggest_labels includes priority + + triage team offline fallback + ✓ team works offline with recruit (produced recommendation) + ✓ team offline output has pipeline_template + ✓ team offline uses recruit source + ✓ team offline without recruit uses heuristic defaults + + + + ────────────────────────────────────────── + + All 23 tests passed + + +shipwright upgrade — test suite + +Script structure + ✓ VERSION variable defined at top of sw-upgrade.sh + ✓ set -euo pipefail 
present + ✓ ERR trap defined + ✓ Standard color definitions present + ✓ Output helper info() defined + ✓ Output helper success() defined + ✓ Output helper warn() defined + ✓ Output helper error() defined + +Upgrade check (dry run) + ✓ Dry run shows comparing text + ✓ Manifest file created on first run + ✓ Manifest is valid JSON + ✓ Manifest has schema field + ✓ Manifest has repo_path + +Upgrade detection + ✓ Detects up-to-date files + ✓ Shows SUMMARY line + ✓ Suggests --apply flag + +Apply mode + ✓ Apply mode shows Applying + ✓ Apply mode completes successfully + +File registry + ✓ FILES array has core scripts (found 64 entries) + ✓ FILES array has tmux templates (found 12) + ✓ FILES array has pipeline templates (found 9) + ✓ Protected files defined in FILES array + +Repo location logic + ✓ find_repo uses --repo-path + ✓ find_repo uses SHIPWRIGHT_REPO_PATH + +Checksum logic + ✓ file_checksum function defined + ✓ Uses md5/md5sum for checksums + ✓ Backup logic (.pre-upgrade.bak) present + ✓ Self-upgrade detection logic present + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright UX Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Basic + ✓ help shows usage + ✓ help shows subcommands + ✓ --help flag works + ✓ unknown command shows error + Theme System + ✓ theme list shows dark + ✓ theme list shows cyberpunk + ✓ theme list shows ocean + ✓ theme set dark succeeds + ✓ theme preview shows colors + Config + ✓ config show outputs theme + ✓ config file has theme key + ✓ config file has spinner key + ✓ config reset succeeds + Spinners + ✓ spinner list shows spinners + ✓ spinner list shows spinner frames + Shortcuts + ✓ shortcuts shows key bindings + Accessibility + ✓ high contrast mode enabled + ✓ reduced motion mode enabled + ✓ screen reader mode enabled + + Sourced Functions + ✓ hex_to_rgb #00d4ff yields 0;212;255 + ✓ hex_to_rgb #4ade80 yields 74;222;128 + ✓ get_color primary 
returns output + ✓ get_color secondary returns output + ✓ get_color success returns output + ✓ get_color warning returns output + ✓ get_color error returns output + ✓ box_title contains title text + ✓ box_title has box drawing + ✓ format_diff_line +line contains text + ✓ format_diff_line -line contains text + ✓ format_diff_line + produces output + ✓ format_diff_line - produces output + + Config Persistence +✓ Theme set to: cyberpunk + ✓ config.json updated with theme + + + + ────────────────────────────────────────── + + All 33 tests passed + + + + Shipwright Webhook Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + help command + ✓ help exits 0 + ✓ help shows usage + ✓ help shows commands + ✓ --help exits 0 + + status subcommand + ✓ status exits 0 + ✓ status shows NOT running + ✓ status shows configuration + + secret subcommand + ✓ secret show exits 0 + ✓ secret show returns long string + ✓ webhook-secret file created + ✓ secret regenerate exits 0 + ✓ regenerate confirms + ✓ secret regenerated is different + + stop subcommand + ✓ stop exits 0 when no server + ✓ stop says not running + + logs subcommand + ✓ logs exits 0 with no logs + ✓ logs says no logs + + setup subcommand + ✓ setup without args exits 1 + ✓ setup shows usage + ✓ setup with bad format exits 1 + ✓ setup shows invalid format + ✓ test without args exits 1 + + error handling + ✓ unknown command exits 1 + ✓ unknown command shows error + + webhook event processing +▸ Webhook: Issue #42 labeled 'shipwright' in test/repo +PROCESSED + ✓ process_webhook_event returns 0 for labeled issue + + + + ────────────────────────────────────────── + + All 25 tests passed + + +shipwright widgets — test suite + +Script structure + ✓ VERSION variable defined at top of sw-widgets.sh + ✓ set -euo pipefail present + ✓ ERR trap defined + ✓ Source guard pattern (BASH_SOURCE check) present + ✓ Standard color definitions present + ✓ Output helper info() defined + ✓ Output 
helper success() defined + ✓ Output helper warn() defined + ✓ Output helper error() defined + +Help command + ✓ help exits 0 + ✓ --help exits 0 + ✓ help contains USAGE + ✓ help lists badge command + ✓ help lists slack command + ✓ help lists markdown command + ✓ help lists json command + ✓ help lists notify command + +Version command + ✓ version command outputs version string + +Unknown command + ✓ Unknown command exits non-zero + ✓ Unknown command shows error + +Badge command + ✓ badge pipeline returns shields.io URL + ✓ badge tests returns shields.io URL + ✓ badge version returns shields.io URL + ✓ badge health returns shields.io URL + ✓ badge all shows Pipeline + ✓ badge all shows Tests + ✓ badge all shows Version + ✓ badge all shows Health + ✓ badge unknown type exits non-zero + +Markdown command + ✓ markdown contains Status Badges header + ✓ markdown contains Pipeline Status section + ✓ markdown contains shields.io URLs + ✓ markdown contains Shipwright attribution + +JSON command + ✓ json command outputs valid JSON + ✓ json contains timestamp field + ✓ json contains pipeline.status + ✓ json contains badges.pipeline + +Notify command + ✓ notify always shows pipeline status + ✓ notify success with unknown pipeline status produces no output + ✓ notify unknown type exits non-zero + +Pipeline status detection + ✓ Script reads pipeline-state.md for status + ✓ Script handles pipeline status states + ✓ badge pipeline produces output + +Slack command validation + ✓ slack without webhook URL exits non-zero + ✓ slack error mentions webhook + +No-args defaults to help + ✓ No args defaults to help (shows USAGE) + + + + ────────────────────────────────────────── + + All 0 tests passed + + + + Shipwright Worktree Tests + ══════════════════════════════════════════ + + ══════════════════════════════════════════ + + Help & Version + ✓ help shows usage + ✓ help shows create + ✓ help shows list + ✓ help shows sync + ✓ help shows merge + ✓ help shows remove + ✓ help shows cleanup + 
✓ help shows status + + List Command + ✓ list with no worktrees + + Create Command + ✓ create without name exits non-zero + ✓ create without name shows usage + ✓ create confirms worktree + + Sync Command + ✓ sync without name exits non-zero + ✓ sync shows usage + + Merge Command + ✓ merge without name exits non-zero + ✓ merge shows usage + + Remove Command + ✓ remove without name exits non-zero + ✓ remove shows usage + + Status Command + ✓ status with no worktrees + + Cleanup Command + ✓ cleanup with no worktrees + + Sync-All Command + ✓ sync-all with no worktrees + + Error Handling + ✓ unknown command exits non-zero + ✓ unknown command shows error + + + + ────────────────────────────────────────── + + All 23 tests passed + + + Lib: compat Tests + ══════════════════════════════════════════ + + + Platform detection + ✓ is_macos returns true on macOS + ✓ is_linux returns false on macOS + + tmp_dir + ✓ tmp_dir returns existing directory: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T/ + ✓ tmp_dir respects TMPDIR + + pid_exists + ✓ pid_exists detects current process (69584) + ✓ pid_exists returns false for non-existent PID + + sw_valid_error_category + ✓ Built-in category 'test_failure' is valid + ✓ Built-in category 'build_error' is valid + ✓ Built-in category 'lint_error' is valid + ✓ Built-in category 'timeout' is valid + ✓ Built-in category 'dependency' is valid + ✓ Built-in category 'flaky' is valid + ✓ Built-in category 'config' is valid + ✓ Built-in category 'security' is valid + ✓ Built-in category 'permission' is valid + ✓ Built-in category 'unknown' is valid + ✓ Invalid category 'not_a_real_category' returns false + ✓ Custom taxonomy category 'custom_cat' is valid + ✓ Custom taxonomy category 'another_cat' is valid + + complexity_bucket + ✓ Complexity 1 is low + ✓ Complexity 2 is low + ✓ Complexity 3 is low + ✓ Complexity 4 is medium + ✓ Complexity 5 is medium + ✓ Complexity 6 is medium + ✓ Complexity 7 is high + ✓ Complexity 10 is high + ✓ Custom boundary: 
4 is low + ✓ Custom boundary: 5 is low + ✓ Custom boundary: 6 is medium + ✓ Custom boundary: 8 is medium + ✓ Custom boundary: 9 is high + + detect_primary_language + ✓ TypeScript detected + ✓ JavaScript detected + ✓ Python detected + ✓ Go detected + ✓ Rust detected + ✓ Java (Gradle) detected + ✓ Java (Maven) detected + ✓ Elixir detected + ✓ Unknown for empty dir + + detect_test_framework + ✓ Vitest framework + ✓ Jest framework + ✓ Mocha framework + ✓ pytest framework + ✓ Go test framework + ✓ Cargo test framework + ✓ Gradle test framework + ✓ No framework for empty dir + + compute_md5 + ✓ compute_md5 --string returns 32-char hash + ✓ compute_md5 is deterministic + ✓ Different inputs produce different hashes + ✓ compute_md5 file returns 32-char hash + + file_mtime + ✓ file_mtime returns numeric epoch for existing file + ✓ file_mtime returns 0 for nonexistent file + + sed_i + ✓ sed_i replaces in-place + + ────────────────────────────────────────── + + All 56 tests passed + + + Lib: helpers Tests + ══════════════════════════════════════════ + + + Output helpers + ✓ info outputs message + ✓ success outputs message + ✓ warn outputs message + ✓ error outputs message + + Timestamp helpers + ✓ now_iso format + ✓ now_epoch is numeric + ✓ now_epoch is a reasonable timestamp + + emit_event +WARN: Unknown event type 'test.event' — update config/event-schema.json + ✓ Events file created + ✓ Event has type + ✓ Event has string field + ✓ Event has numeric field + ✓ Event has timestamp + ✓ Event has epoch + ✓ Event line is valid JSON + ✓ Two events produce two lines + ✓ Event with quotes is valid JSON + + with_retry + ✓ with_retry succeeds on first try +⚠ Attempt 1/2 failed (exit 1), retrying in 1s... + ✓ with_retry fails after 2 attempts +⚠ Attempt 1/3 failed (exit 1), retrying in 1s... 
+ ✓ with_retry succeeds on second attempt + ✓ Flaky command ran exactly 2 times + + validate_json + ✓ validate_json passes for valid JSON + ✓ Backup created +⚠ Corrupt JSON detected: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-lib-helpers-test.MaTzle/corrupt.json +⚠ Recovered from backup: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-lib-helpers-test.MaTzle/corrupt.json.bak + ✓ validate_json recovers from backup + ✓ Recovered content is valid +⚠ Corrupt JSON detected: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-lib-helpers-test.MaTzle/nobackup.json + ✓ validate_json fails for corrupt JSON with no backup + ✓ validate_json passes for non-existent file + + rotate_jsonl + ✓ Under-limit file not rotated + ✓ Over-limit file rotated to 10 lines + ✓ Keeps most recent lines + ✓ rotate_jsonl handles nonexistent file + + Project identity + ✓ github_repo extracts from remote + ✓ _sw_github_owner returns non-empty: testuser + ✓ _sw_docs_url contains github.io + ✓ _sw_github_url contains github.com + + ────────────────────────────────────────── + + All 34 tests passed + + + Lib: error-actionability Tests + ══════════════════════════════════════════ + + + File Path Detection + ✓ component: absolute path + ✓ component: relative path + ✓ score: no path scores 0 + + Line Number Detection + ✓ component: colon format + ✓ component: 'at line' format + ✓ score: no line number scores 0 + + Error Type Detection + ✓ component: TypeError + ✓ component: SyntaxError + ✓ component: ENOENT + ✓ component: Generic Error + + Actionable Detail Detection + ✓ component: 'cannot' keyword + ✓ component: 'does not exist' + ✓ component: 'permission denied' + ✓ component: 'is not a function' + ✓ component: 'failed to' + + Fix Suggestion Detection + ✓ component: 'try' keyword + ✓ component: 'check' keyword + ✓ component: 'ensure' keyword + ✓ component: 'run' keyword + ✓ component: 'remove' keyword + + Score Calculations + ✓ score: all 5 components + ✓ score: 3 components 
(path+line+error) + ✓ score: 2 components (actionable+fix) + ✓ score: no components + + Enhancement Thresholds + ✓ enhancement: high score (>=70) + ✓ enhancement: low score (<70) + ✓ enhancement: threshold at 70 + + Error Categorization + ✓ category: FILE_ACCESS + ✓ category: FUNCTION_ERROR + ✓ category: SYNTAX_ERROR + ✓ category: TYPE_ERROR + ✓ category: ASSERTION_FAILURE + ✓ category: TIMEOUT + ✓ category: MEMORY_ERROR + ✓ category: NETWORK_ERROR + + Location Extraction + ✓ file:line format + ✓ file extraction + ✓ at line format + ✓ File format + + Error Deduplication + ✓ deduplicates repeated lines + ✓ ANSI codes stripped + + Error Classification + ✓ classify syntax error + ✓ classify type error + ✓ classify assertion + ✓ classify runtime error + ✓ classify dependency error + ✓ classify permission error + ✓ classify network error + ✓ classify unknown error + + ────────────────────────────────────────── + + All 49 tests passed + + + Lib: daemon-dispatch Tests + ══════════════════════════════════════════ + + + daemon_track_job + ✓ Job added to active_jobs + ✓ Correct issue number + ✓ Correct PID + + daemon_spawn_pipeline +/tmp/mock-repo +/tmp/mock-repo +/tmp/mock-repo +/tmp/mock-repo +/tmp/mock-repo + ✓ Spawn attempted (track tested separately) + + daemon_spawn_pipeline disk check + ✓ No spawn when low disk + + daemon_reap_completed +✓ Pipeline completed for issue #50 (0s) +⚠ No previous recommendation found for issue #50 +✓ Recorded success outcome for issue #50 +▸ Cleaned worktree: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//lib-daemon-dispatch.HTXU1g/project/.worktrees/daemon-issue-50 + ✓ Reap removes job from active_jobs + ✓ Reap emits daemon.reap event + + Lib: daemon-failure Tests + ══════════════════════════════════════════ + + + classify_failure + ✓ No log file → unknown + ✓ Empty LOG_DIR → unknown + ✓ Auth error classified + ✓ 401 auth error classified + ✓ OAuth token error classified + ✓ Rate limit classified + ✓ 503 error classified + ✓ Socket 
hang up classified + ✓ ETIMEDOUT classified + ✓ Issue not found classified + ✓ Could not resolve classified + ✓ npm test failure classified + ✓ Test FAIL classified + ✓ Compile error classified + ✓ Context exhaustion classified + ✓ Context exhaustion with unknown tests + ✓ Generic failure → unknown + + get_max_retries_for_class + ✓ auth_error: 0 retries + ✓ invalid_issue: 0 retries + ✓ api_error: default 4 retries + ✓ context_exhaustion: 2 retries + ✓ build_failure: 2 retries + ✓ unknown: default 2 retries + ✓ Custom api_error retries + ✓ Custom default retries + + Failure tracking + ✓ Initial consecutive class is empty + ✓ Initial consecutive count is 0 + ✓ Same-class increments count + ✓ Different class resets count + ✓ Reset clears class + ✓ Reset clears count + + classify_failure edge cases + ✓ Auth error takes priority over build error + ✓ API error takes priority over build error + ✓ Empty log → unknown + + ────────────────────────────────────────── + + All 34 tests passed + + + Lib: daemon-poll Tests + ══════════════════════════════════════════ + + + daemon_health_check + ✓ daemon_health_check with no state file + ✓ daemon_health_check with empty active_jobs + ✓ daemon_health_check with dead PID skips job + + daemon_check_degradation + ✓ daemon_check_degradation with no events file + ✓ daemon_check_degradation with fewer than 5 events + ✓ daemon_check_degradation with high CFR + ✓ daemon_check_degradation with good success rate + + daemon_cleanup_stale + ✓ daemon_cleanup_stale when disabled + ✓ daemon_cleanup_stale runs + ✓ State file exists after cleanup + + Integration: state + cleanup + ✓ daemon_cleanup_stale prunes old completed entries + + Exponential Backoff in daemon-poll + ✓ Initial backoff set to 30 seconds + ✓ Backoff doubled correctly + ✓ Backoff doubled correctly + ✓ Backoff doubled correctly + ✓ Backoff doubled correctly + ✓ Backoff ceiling enforced at 300 + ✓ Backoff reset on success + + Backoff state preservation across poll cycles + ✓ After 
1st failure: 30s backoff + ✓ After 2nd failure: 60s backoff + ✓ After 3rd failure: 120s backoff + + Backoff uninitialized variable handling + ✓ Uninitialized BACKOFF_SECS defaults to 0 + ✓ First poll failure initializes to 30 + + Integration: Backoff + rate limiting + ✓ 1st failure: backoff=30s, GH_failures=1 + ✓ 2nd failure: backoff still 30s, GH_failures=2 + ✓ 3rd failure: exponential backoff doubles to 60s + ✓ Success resets failures=0 + ✓ Success resets backoff=0 + + Backoff edge cases + ✓ Negative backoff reset to 30 + ✓ Overshooting backoff clamped to 300 + ✓ Integer arithmetic on backoff + + ────────────────────────────────────────── + + All 31 tests passed + + + Lib: daemon-state Tests + ══════════════════════════════════════════ + + + atomic_write_state + ✓ State file created + ✓ State file has correct content + ✓ State file overwritten + ✓ State file has new field + + locked_state_update + ✓ locked_state_update adds to queue + ✓ Queue contains issue 42 + ✓ Queue now has 2 items + ✓ locked_state_update with --arg + + init_state + ✓ init_state creates state file + ✓ State version is 1 + ✓ Config poll_interval is 60 + ✓ Config max_parallel is 2 + ✓ Config watch_label is shipwright + ✓ Active jobs initially empty + ✓ Queue initially empty + ✓ Completed initially empty + ✓ Re-init updates PID + + enqueue_issue / dequeue_next +▸ Queued issue 10 (at capacity) +▸ Queued issue 20 (at capacity) +▸ Queued issue 30 (at capacity) + ✓ 3 issues queued + ✓ Dequeue returns first item + ✓ Second dequeue returns 20 + ✓ Queue has 1 remaining + ✓ Third dequeue returns 30 + ✓ Empty queue returns empty +▸ Queued issue 42 (at capacity) +▸ Queued issue 42 (at capacity) + ✓ Duplicate enqueue deduplicated + + daemon_is_inflight + ✓ Issue 42 not inflight initially +▸ Queued issue 42 (at capacity) + ✓ Issue 42 inflight after enqueue + ✓ Issue 99 inflight (active job) + ✓ Issue 777 not inflight + ✓ No state file → not inflight + + get_active_count + ✓ No state file → 0 active + ✓ 
Empty state → 0 active + ✓ 1 active job + ✓ 2 active jobs + ✓ locked_get_active_count matches + + update_state_field + ✓ Field updated + ✓ Custom field added + + daemon_log + ✓ Log has message + ✓ Log has INFO level + ✓ Log has timestamp + + GitHub rate limit circuit breaker + ✓ Not rate limited initially + ✓ Success resets failure count + ✓ 2 failures not yet limited +⚠ GitHub rate-limit circuit breaker: backing off 30s after 3 failures + ✓ 3 consecutive failures + ✓ Backoff set after 3 failures + ✓ Success resets failures + ✓ Success resets backoff + + is_priority_issue + ✓ urgent label is priority + ✓ p0 label is priority + ✓ non-priority labels not flagged + + Priority lane tracking + ✓ Initially 0 priority jobs + ✓ 1 priority job after track + ✓ Duplicate track still 1 + ✓ 2 priority jobs + ✓ 1 priority job after untrack + ✓ 0 priority jobs after all untracked - ▸ Preflight passes with all mocks... ✓ - ▸ Preflight fails when sw-loop.sh missing... ✓ - ▸ Start requires --goal or --issue... ✓ - ▸ Intake with --goal creates branch + artifacts... ✓ - ▸ Intake with --issue fetches from GitHub... ✓ - ▸ Plan generates plan.md, dod.md, tasks... ✓ - ▸ Build invokes sw loop and commits... ✓ - ▸ Test stage captures results to log... ✓ - ▸ Review generates report with severities... ✓ - ▸ PR stage creates PR URL artifact... ✓ - ▸ Full E2E pipeline (6 stages)... ✓ - ▸ Resume continues from partial state... ✓ - ▸ Abort marks pipeline as aborted... ✓ - ▸ Dry run shows config, no artifacts... ✓ - ▸ Self-healing build→test retry loop... ✓ + claim_issue (NO_GITHUB mode) + ✓ claim_issue succeeds in NO_GITHUB mode + ✓ release_claim succeeds in NO_GITHUB mode -━━━ Results ━━━ - Passed: 15 - Failed: 0 - Total: 15 + ────────────────────────────────────────── -All 15 tests passed! 
+ All 57 tests passed -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright daemon test — Unit Tests (Synthetic Events) ║ -╚═══════════════════════════════════════════════════════════════════╝ + Lib: daemon-triage Tests + ══════════════════════════════════════════ -Setting up test environment... -✓ Environment ready: /tmp/sw-daemon-test.G3riH7 - ▸ dora_grade deploy_freq Elite (>= 7)... ✓ - ▸ dora_grade deploy_freq High (>= 1)... ✓ - ▸ dora_grade deploy_freq Medium (>= 0.25)... ✓ - ▸ dora_grade deploy_freq Low (< 0.25)... ✓ - ▸ dora_grade CFR thresholds (Elite/High/Medium/Low)... ✓ - ▸ Stage timings filter-first jq query... ✓ - ▸ MTTR pairs failures with next success... ✓ - ▸ epoch_to_iso helper function... ✓ - ▸ Health check detects stale jobs... ✓ - ▸ Priority label sorting... ✓ - ▸ Degradation alert triggers on high CFR... ✓ - ▸ Metrics --json output with cycle_time keys... ✓ - ▸ Self-labeling includes watch_label when enabled... ✓ - ▸ Self-labeling excludes watch_label when disabled... ✓ - ▸ Patrol recurring failures label construction... ✓ - ▸ DORA degradation event detection... ✓ - ▸ Retry exhaustion event detection... ✓ - ▸ Untested script detection logic... ✓ + extract_issue_dependencies + ✓ Extracts depends-on refs + ✓ Extracts second depends-on ref + ✓ Extracts blocked-by ref + ✓ Extracts after ref + ✓ No deps returns empty + ✓ Deduplicates refs -━━━ Results ━━━ - Passed: 18 - Failed: 0 - Total: 18 + triage_score_issue — Priority + ✓ Urgent issue scores high + ✓ High priority scores > 15 + ✓ Low priority >= no labels -All 18 tests passed! 
+ triage_score_issue — Complexity + ✓ Simple task scores higher than complex (simple=35, complex=25) + triage_score_issue — Type + ✓ Security score > 0 + ✓ Bug score > 0 -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright prep test — Validation Suite (Real Subprocess) ║ -╚═══════════════════════════════════════════════════════════════════╝ + triage_score_issue — Score bounds + ✓ Score within 0-100 bounds (75) -Setting up test environment... -✓ Environment ready: /tmp/sw-prep-test.3ZRNVS - - ▸ Node.js project detection... ✓ - ▸ Python project detection... ✓ - ▸ Go project detection... ✓ - ▸ Rust project detection... ✓ - ▸ settings.json is valid JSON... ✓ - ▸ Hook scripts are executable... ✓ - ▸ Hook scripts have valid syntax... ✓ - ▸ CLAUDE.md has required sections... ✓ - ▸ Check mode outputs scoring... ✓ - ▸ Idempotency without --force... ✓ - ▸ --force overwrites modified files... ✓ - ▸ No eval in generated hooks... ✓ - ▸ Definition of Done generated... ✓ + select_pipeline_template + ✓ Hotfix label → hotfix template + ✓ Security label → enterprise template + ✓ Incident label → hotfix template + ✓ Very high score → minimal template + ✓ High score → fast template + ✓ Medium score → standard template + ✓ Low score → full template + ✓ Auto-template disabled → default -━━━ Results ━━━ - Passed: 13 - Failed: 0 - Total: 13 + select_pipeline_template — Quality memory + ✓ Critical quality findings → enterprise -All 13 tests passed! + select_pipeline_template — Template weights + ✓ Template weights → best template (fast) + ────────────────────────────────────────── -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright fleet test — Unit Tests ║ -╚═══════════════════════════════════════════════════════════════════╝ + All 23 tests passed -Setting up test environment... -✓ Environment ready: /tmp/sw-fleet-test.f4j7ea - ▸ Help output contains expected sections... ✓ - ▸ Help via --help flag... 
✓ - ▸ Config parsing — valid config... ✓ - ▸ Config parsing — missing config file... ✓ - ▸ Config parsing — invalid JSON... ✓ - ▸ Config parsing — empty repos array... ✓ - ▸ Config defaults applied... ✓ - ▸ Fleet init generates config template... ✓ - ▸ Fleet init skips when config exists... ✓ - ▸ Fleet start spawns tmux sessions per repo... ✓ - ▸ Fleet start skips missing repos... ✓ - ▸ Fleet start skips existing sessions... ✓ - ▸ Fleet start creates fleet state file... ✓ - ▸ Fleet start emits fleet.started event... ✓ - ▸ Fleet start applies repo-level overrides... ✓ - ▸ Fleet stop kills sessions and cleans state... ✓ - ▸ Fleet stop — no fleet running... ✓ - ▸ Fleet stop emits fleet.stopped event... ✓ - ▸ Fleet status — no fleet running... ✓ - ▸ Fleet status shows dashboard... ✓ - ▸ Fleet metrics — no events file... ✓ - ▸ Fleet metrics dashboard output... ✓ - ▸ Fleet metrics JSON output... ✓ - ▸ Fleet metrics period flag... ✓ - ▸ Session name generation... ✓ - ▸ Fleet start skips non-git repos... ✓ - ▸ Unknown subcommand... ✓ + Lib: daemon-patrol Tests + ══════════════════════════════════════════ -━━━ Results ━━━ - Passed: 27 - Failed: 0 - Total: 27 -All 27 tests passed! 
+ daemon_patrol_security_scan + ✓ daemon_patrol_security_scan runs without errors + + daemon_patrol_config_refresh + ✓ daemon_patrol_config_refresh loads valid config + ✓ daemon_patrol_config_refresh handles syntax errors gracefully + + daemon_patrol_worker_memory + ✓ daemon_patrol_worker_memory analyzes memory data + ✓ daemon_patrol_worker_memory handles missing data + + daemon_patrol_regression + ✓ daemon_patrol_regression detects metric changes + + daemon_patrol_auto_scale + ✓ daemon_patrol_auto_scale (80% CPU, 4GB RAM) + ✓ daemon_patrol_auto_scale (50% CPU, 1GB RAM) + ✓ daemon_patrol_auto_scale (50% CPU, 8GB RAM) + ✓ daemon_patrol_auto_scale (40% CPU, 6GB RAM) + + daemon_patrol_architecture_enforce + + + + + + ✓ daemon_patrol_architecture_enforce validates rules + + patrol_build_labels + ✓ patrol_build_labels without auto-watch + ✓ patrol_build_labels with auto-watch includes WATCH_LABEL + + Integration: Patrol signal emission + ✓ Patrol handles decision engine signal mode + + Edge cases and error handling + ✓ Handles missing configuration file + ✓ Handles empty metrics data + ✓ Patrol respects DRY_RUN flag + + ────────────────────────────────────────── + + All 17 tests passed + + + Lib: pipeline-detection Tests + ══════════════════════════════════════════ + + + detect_test_cmd + ✓ Node.js project defaults to npm test + ✓ pnpm lockfile detected + ✓ yarn lockfile detected + ✓ bun lockfile detected + ✓ package.json without test script returns empty + ✓ npm 'no test specified' returns empty + ✓ Python pyproject.toml with pytest + ✓ Python setup.py + tests dir + ✓ Rust project + ✓ Go project + ✓ Ruby with rspec + ✓ Ruby without rspec + ✓ Maven project + ✓ Gradle project + ✓ Gradle Kotlin DSL project + ✓ Makefile with test target + ✓ Empty project returns empty + + detect_project_lang + ✓ TypeScript detected + ✓ Next.js detected + ✓ React detected + ✓ Node.js detected + ✓ Rust lang detected + ✓ Go lang detected + ✓ Python detected + ✓ Ruby detected + ✓ Java detected 
+ ✓ Unknown for empty project + + detect_task_type + ✓ Bug from 'fix' keyword + ✓ Bug from 'crash' keyword + ✓ Refactor keyword + ✓ Testing keyword + ✓ Security keyword + ✓ Docs keyword + ✓ DevOps keyword + ✓ Migration keyword + ✓ Architecture keyword + ✓ Feature default + + branch_prefix_for_type + ✓ Bug prefix + ✓ Refactor prefix + ✓ Testing prefix + ✓ Security prefix + ✓ Docs prefix + ✓ DevOps prefix + ✓ Migration prefix + ✓ Architecture prefix + ✓ Feature prefix (default) + ✓ Unknown type defaults to feat + + template_for_type + ✓ Bug template + ✓ Refactor template + ✓ Testing template + ✓ Security template + ✓ Docs template + ✓ DevOps template + ✓ Migration template + ✓ Architecture template + ✓ Feature template + ✓ Unknown template + + _detect_package_manager + ✓ No lockfile defaults to npm + ✓ pnpm-lock.yaml detected + ✓ bun.lockb detected + ✓ yarn.lock detected + + detect_test_commands + ✓ Single test script returns 1 command + ✓ Primary command is npm test + ✓ Heavyweight tests filtered: returns 3 commands + ✓ Primary command first + ✓ test:unit included + ✓ test:smoke included + ✓ test:e2e excluded (heavyweight) + ✓ test:integration excluded (heavyweight) + ✓ test:system excluded (heavyweight) + ✓ Subdirectory test runner discovered + ✓ Subdirectory with 'no test' excluded + ✓ Empty project returns empty + + ────────────────────────────────────────── + + All 73 tests passed + + + Lib: pipeline-intelligence Tests + ══════════════════════════════════════════ + + + classify_quality_findings + ✓ No findings defaults to correctness + ✓ Security findings route to security + ✓ Creates classified-findings.json + ✓ Security count > 0 + ✓ Performance findings route to performance + ✓ Style-only findings route to correctness + ✓ Architecture findings route to architecture + + pipeline_should_skip_stage + ✓ Stage intake correctly not skipped + ✓ Stage build correctly not skipped + ✓ Stage test correctly not skipped + ✓ Stage pr correctly not skipped + ✓ Stage merge 
correctly not skipped + ✓ Docs label skips compound_quality + ✓ Hotfix label skips compound_quality + ✓ Low complexity skips design + ✓ Reassessment skips compound_quality + ✓ Docs label skips review + ✓ Hotfix skips plan + + pipeline_adaptive_cycles + ✓ Base limit returned + ✓ Convergence extends limit + ✓ Divergence reduces limit + ✓ Learned model applied + ✓ Result within ceiling + + pipeline_verify_dod +✓ DoD verification: 100% pass rate (1/1 checks) + ✓ pipeline_verify_dod passes with no changed files + ✓ Creates dod-verification.json + ✓ Pass rate >= 70 +✓ DoD verification: 75% pass rate (3/4 checks) + ✓ pipeline_verify_dod with dod-audit + + pipeline_record_quality_score + ✓ Quality scores file created + ✓ One score recorded + ✓ Score has quality_score + ✓ Score has critical in findings + ✓ Score has repo + ✓ Second score appended + + pipeline_select_audits + ✓ Returns JSON with audit keys + ✓ Returns security + ✓ Returns dod + ✓ Off intensity returns all off + + pipeline_reassess_complexity + ✓ Reassessment returns valid assessment + ✓ Creates reassessment.json + + pipeline_security_source_scan + ✓ pipeline_security_source_scan scans source for vulnerabilities + ✓ pipeline_security_source_scan completes + ✓ pipeline_security_source_scan handles safe code + + pipeline_backtrack_to_stage + ✓ pipeline_backtrack_to_stage respects max backtrack limit + ✓ pipeline_backtrack_to_stage is defined + + compound_rebuild_with_feedback + ✓ compound_rebuild_with_feedback is defined + ✓ classify_quality_findings returns routing decision + + Integration: Full intelligence pipeline +✓ DoD verification: 75% pass rate (3/4 checks) +[] +{"adversarial":"off","architecture":"off","simulation":"off","security":"off","dod":"off"} + ✓ Integration created quality scores + ✓ Integration created dod verification + + Edge cases: Intelligence robustness +✓ DoD verification: 75% pass rate (3/4 checks) + ✓ pipeline_verify_dod handles missing BASE_BRANCH + ✓ classify_quality_findings 
handles corrupted JSON +[] + ✓ pipeline_security_source_scan handles large files +[] + ✓ pipeline_security_source_scan handles many vulnerabilities + + ────────────────────────────────────────── + + All 52 tests passed + + + Lib: pipeline-quality-checks Tests + ══════════════════════════════════════════ + + + run_test_coverage_check + ✓ No TEST_CMD returns skip + ✓ Extracts coverage from Jest/Istanbul format + ✓ Extracts coverage from coverage format + ✓ Failing test returns 0 + + run_bash_compat_check + ✓ No changed .sh files returns 0 + + run_new_function_test_check + ✓ No new functions in diff returns 0 + + run_atomic_write_check + ✓ No state/config changes returns 0 + + quality_check_coverage + ✓ quality_check_coverage passes when no test log + ✓ quality_check_coverage passes with coverage data + + quality_check_security + ✓ quality_check_security skips when no audit tool + ✓ Creates security-audit.log + ✓ Audit log has content + + quality_check_bundle_size + ✓ quality_check_bundle_size skips when no build dir + ✓ quality_check_bundle_size passes with build dir + ✓ Creates bundle-metrics.log + + quality_check_perf_regression + ✓ quality_check_perf_regression skips without test log + ✓ quality_check_perf_regression with duration + ────────────────────────────────────────── -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright fix test — Unit Tests ║ -╚═══════════════════════════════════════════════════════════════════╝ + All 17 tests passed + + + Lib: pipeline-stages Tests + ══════════════════════════════════════════ + + + show_stage_preview + ✓ Intake preview + ✓ Build preview + ✓ test_first preview + ✓ PR preview + + stage_intake +▸ Project: nodejs +▸ Issue #42: Add JWT auth +▸ Labels: feature +▸ Milestone: v2.0 +▸ Detected: feature → team template: feature-dev +✓ Branch: feat/add-jwt-auth-42 +▸ Skill analysis: LLM unavailable — using label-based classification + ✓ stage_intake completed + ✓ Goal set from issue + ✓ Branch created 
+▸ Project: nodejs +▸ Detected: feature → team template: feature-dev +✓ Branch: feat/add-rate-limiting +▸ Skill analysis: LLM unavailable — using label-based classification + ✓ Intake inline artifact + + stage_plan +▸ Generating implementation plan... +▸ Building context bundle for plan stage... +✓ Context bundle written to /Users/sethford/Documents/shipwright/.worktrees/pipeline-issue-122/.claude/pipeline-artifacts/context-bundle.md +▸ Plan saved: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//lib-pipeline-stages.WUgvR1/project/.claude/pipeline-artifacts/plan.md (12 lines) +▸ Task list: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//lib-pipeline-stages.WUgvR1/project/.claude/pipeline-tasks.md (2 tasks) +▸ Claude Code tasks: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//lib-pipeline-stages.WUgvR1/project/.claude/tasks.md +▸ Validating plan (attempt 1/2)... +⚠ Plan validation failed (attempt 1/2) +▸ Regenerating plan with validation feedback (mode: unknown)... +▸ Regenerated plan: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//lib-pipeline-stages.WUgvR1/project/.claude/pipeline-artifacts/plan.md (12 lines) +▸ Validating plan (attempt 2/2)... 
+⚠ Plan validation failed (attempt 2/2) +⚠ Plan validation did not pass after 2 attempts — proceeding anyway + ✓ Plan generated + ✓ Plan has checklist + ✓ Plan has steps + ✓ DoD extracted + ✓ Tasks file + + stage_build +[feat/add-rate-limiting 93c30ce] feat: add auth + 1 file changed, 1 insertion(+) + create mode 100644 src/auth.js +▸ Starting build loop: shipwright loop (max 20 iterations, 1 agent(s)) +[feat/add-rate-limiting 3b48171] feat: add auth +▸ Build produced 2 commit(s) + ✓ Build stage completes + ✓ Build produced source file + + stage_test +▸ Running tests: echo 'All 8 tests passed' +✓ Tests passed + ✓ Test log created + ✓ Test output captured +▸ Running tests: echo 'Statements : 85.5%' +✓ Tests passed + ✓ Coverage parsed +▸ Running tests: echo FAIL; exit 1 +FAIL + ✓ Stage test returns 1 on test failure + + stage_review +▸ Running AI code review... ( 1 file changed, 1 insertion(+)) +▸ Two-stage review: Pass 1 — Spec compliance +✓ Spec compliance: PASS +▸ Two-stage review: Pass 2 — Code quality +✓ Review clean + ✓ Review generated + ✓ Review has content + + stage_pr quality gate +[feat/empty eaaf669] artifacts + 20 files changed, 77 insertions(+) + create mode 100644 .claude/foo/bar + create mode 100644 .claude/pipeline-artifacts/.claude-tokens-build.log + create mode 100644 .claude/pipeline-artifacts/.claude-tokens-plan-validate.log + create mode 100644 .claude/pipeline-artifacts/.claude-tokens-plan.log + create mode 100644 .claude/pipeline-artifacts/.claude-tokens-review.log + create mode 100644 .claude/pipeline-artifacts/.claude-tokens-spec-review.log + create mode 100644 .claude/pipeline-artifacts/.plan-failure-sig.txt + create mode 100644 .claude/pipeline-artifacts/design.md + create mode 100644 .claude/pipeline-artifacts/dod.md + create mode 100644 .claude/pipeline-artifacts/intake.json + create mode 100644 .claude/pipeline-artifacts/plan-validation.md + create mode 100644 .claude/pipeline-artifacts/plan.md + create mode 100644 
.claude/pipeline-artifacts/review-diff.patch + create mode 100644 .claude/pipeline-artifacts/review-spec.md + create mode 100644 .claude/pipeline-artifacts/review.md + create mode 100644 .claude/pipeline-artifacts/test-coverage.json + create mode 100644 .claude/pipeline-artifacts/test-results.log + create mode 100644 .claude/pipeline-tasks.md + create mode 100644 .claude/tasks.md +▸ Skipping PR stage — running in local/no-github mode + ✓ PR quality gate executed (rc=0) + + detect_task_type + ✓ Bug type + ✓ Refactor type + ✓ Feature type + + branch_prefix_for_type + ✓ Bug prefix + ✓ Feature prefix + + detect_project_lang + ✓ Project lang detected + + gh_get_issue_meta + ✓ Issue meta has title + ✓ Title parsed + + ────────────────────────────────────────── + + All 30 tests passed + + + Lib: pipeline-state Tests + ══════════════════════════════════════════ + + + save_artifact + ✓ Artifact created + ✓ Artifact content correct + ✓ JSON artifact is valid + + Stage status management + ✓ No status initially + ✓ Build status is running + ✓ Test status is pending + ✓ Build still running + ✓ Build updated to complete + ✓ Test still pending + + Stage timing + ✓ No timing initially + ✓ Timing shows duration + ✓ Build took ~65 seconds + ✓ In-progress stage timing (~10s) + ✓ Unknown stage → 0 seconds + + get_stage_description + ✓ intake description + ✓ plan description + ✓ build description + ✓ test description + ✓ review description + ✓ pr description + ✓ merge description + ✓ deploy description + ✓ monitor description + ✓ Unknown stage → empty + + verify_stage_artifacts +⚠ verify_stage_artifacts(plan): missing or empty: plan.md + ✓ Plan stage fails without plan.md + ✓ Plan stage passes with plan.md +⚠ verify_stage_artifacts(design): missing or empty: design.md + ✓ Design stage fails without design.md + ✓ Design stage passes with both artifacts + ✓ Build stage always passes (no artifacts) +⚠ verify_stage_artifacts(plan): missing or empty: plan.md + ✓ Empty plan.md fails 
verification + + Stage effectiveness tracking + ✓ Effectiveness file created + ✓ Has stage + ✓ Has outcome + ✓ Has timestamp + ✓ Hint for failed builds + ✓ Hint for failed plans + ✓ No hint when mostly successful + + log_stage + ✓ Log entry has stage + ✓ Log entry has message + ✓ Second log entry + + initialize_state + ✓ Pipeline status set to running + ✓ Started timestamp set + ✓ Stage statuses cleared + ✓ Log entries cleared + ✓ write_state called during init + + persist_artifacts + ✓ persist_artifacts is no-op outside CI + + ────────────────────────────────────────── + + All 46 tests passed + + + sw-adapters Tests + ══════════════════════════════════════════ + + + Adapter files exist and executable + ✓ adapter exists: tmux-adapter.sh + ✓ adapter executable: tmux-adapter.sh + ✓ adapter exists: iterm2-adapter.sh + ✓ adapter executable: iterm2-adapter.sh + ✓ adapter exists: wezterm-adapter.sh + ✓ adapter executable: wezterm-adapter.sh + ✓ adapter exists: docker-deploy.sh + ✓ adapter executable: docker-deploy.sh + ✓ adapter exists: fly-deploy.sh + ✓ adapter executable: fly-deploy.sh + ✓ adapter exists: vercel-deploy.sh + ✓ adapter executable: vercel-deploy.sh + ✓ adapter exists: railway-deploy.sh + ✓ adapter executable: railway-deploy.sh + + Adapters source cleanly in subshell + ✓ docker-deploy.sh sources in subshell + ✓ fly-deploy.sh sources in subshell + ✓ vercel-deploy.sh sources in subshell + ✓ railway-deploy.sh sources in subshell + ✓ tmux-adapter.sh sources in subshell + ✓ wezterm-adapter.sh sources in subshell + ✓ iterm2-adapter.sh sources in subshell + + Terminal adapters define expected functions + ✓ tmux defines spawn_agent + ✓ tmux defines list_agents + ✓ tmux defines kill_agent + ✓ tmux defines focus_agent + ✓ wezterm defines spawn_agent + ✓ wezterm defines list_agents + ✓ wezterm defines kill_agent + ✓ wezterm defines focus_agent + ✓ iterm2 defines spawn_agent + ✓ iterm2 defines list_agents + ✓ iterm2 defines kill_agent + ✓ iterm2 defines focus_agent + 
+ tmux adapter key functions + ✓ tmux has pane creation helper (spawn_agent) + ✓ tmux uses pane map for stable IDs + + Deploy adapters export expected functions + ✓ docker defines detect_platform + ✓ docker defines get_staging_cmd + ✓ docker defines get_production_cmd + ✓ docker defines get_rollback_cmd + ✓ docker defines get_health_url + ✓ docker defines get_smoke_cmd + ✓ docker get_staging_cmd contains docker + ✓ docker get_production_cmd non-empty + ✓ docker get_rollback_cmd non-empty + ✓ fly defines detect_platform + ✓ fly defines get_staging_cmd + ✓ fly defines get_production_cmd + ✓ fly defines get_rollback_cmd + ✓ fly defines get_health_url + ✓ fly defines get_smoke_cmd + ✓ fly get_staging_cmd contains fly + ✓ fly get_production_cmd non-empty + ✓ fly get_rollback_cmd non-empty + ✓ vercel defines detect_platform + ✓ vercel defines get_staging_cmd + ✓ vercel defines get_production_cmd + ✓ vercel defines get_rollback_cmd + ✓ vercel defines get_health_url + ✓ vercel defines get_smoke_cmd + ✓ vercel get_staging_cmd contains vercel + ✓ vercel get_production_cmd non-empty + ✓ vercel get_rollback_cmd non-empty + ✓ railway defines detect_platform + ✓ railway defines get_staging_cmd + ✓ railway defines get_production_cmd + ✓ railway defines get_rollback_cmd + ✓ railway defines get_health_url + ✓ railway defines get_smoke_cmd + ✓ railway get_staging_cmd contains railway + ✓ railway get_production_cmd non-empty + ✓ railway get_rollback_cmd non-empty + + No hardcoded user paths in adapters + ✓ adapters have no hardcoded user paths + + ────────────────────────────────────────── + + All 72 tests passed + + + sw-evidence Tests + ══════════════════════════════════════════ + + + help + ✓ help shows usage + ✓ help mentions capture + ✓ help mentions verify + ✓ help mentions pre-pr + ✓ help exits 0 + ✓ -h shows usage + + types + ✓ types lists browser + ✓ types lists api + ✓ types lists cli + ✓ types lists database + + capture cli + ✓ capture runs collectors + ✓ cli-echo evidence 
file + ✓ cli-true evidence file + ✓ evidence has name + ✓ evidence has type + ✓ evidence has passed + ✓ evidence has captured_at + + manifest + ✓ manifest created + ✓ manifest has captured_at + ✓ manifest has collector_count + ✓ manifest has collectors + ✓ manifest is valid JSON + + verify + ✓ verify checks evidence + ✓ verify fails without manifest +▸ Capturing evidence (type: cli)... +▸ [cli] cli-echo: echo '{"status":"ok","version":"1.0"}' +✓ [cli] cli-echo: passed +▸ [cli] cli-true: true +✓ [cli] cli-true: passed + +✓ All 2 collector(s) passed + + verify artifact freshness +▸ Capturing evidence (type: cli)... +▸ [cli] cli-echo: true +✓ [cli] cli-echo: passed + +✓ All 1 collector(s) passed + ✓ verify reports stale evidence + + pre-pr + ✓ pre-pr runs capture + ✓ pre-pr runs verify + ✓ pre-pr creates manifest + + status + ✓ status shows manifest path + ✓ status shows collectors + + mutation testing + ✓ mutation capture runs + ✓ mutation evidence file + ✓ mutation has mutation_score + ✓ mutation has total_mutants + + property-based testing + ✓ property capture runs + ✓ property evidence file + ✓ property has passed_count + ✓ property has failed_count + ✓ property has total_iterations + + invariant checking + ✓ invariant capture runs + ✓ invariant evidence file + ✓ invariant has invariant_name + ✓ invariant has check_exit_code + + artifact capture + ✓ artifact capture runs + ✓ artifact stored + ✓ artifact manifest created + ✓ artifact manifest has name + ✓ artifact manifest has sha256 + + quality score computation +▸ Capturing evidence... 
+▸ [cli] cli-pass: true +✓ [cli] cli-pass: passed +▸ [mutation] mutation-simple: testing mutation coverage (threshold: 50%) +▸ [property] property-simple: running property test (5 iterations) +✓ [property] property-simple: passed +▸ [invariant] invariant-simple: checking invariant 'test-invariant' +✓ [invariant] invariant-simple: passed + +⚠ 3/4 passed, 1 failed + ✓ quality-score command runs + ✓ quality-score shows score + + list types with new collectors + ✓ types lists mutation + ✓ types lists property + ✓ types lists invariant + + ────────────────────────────────────────── + + All 53 tests passed + + +╔═══════════════════════════════════════════════════════════╗ +║ shipwright review-rerun — Test Suite (14 tests) ║ +╚═══════════════════════════════════════════════════════════╝ Setting up test environment... -✓ Environment ready: /tmp/sw-fix-test.XuOv7c - ▸ Help output contains expected sections... ✓ - ▸ Help via --help flag... ✓ - ▸ Missing goal shows error... ✓ - ▸ Missing repos shows error... ✓ - ▸ Arg parsing — --repos comma-separated... ✓ - ▸ Arg parsing — --repos-from file... ✓ - ▸ Arg parsing — --repos-from missing file... ✓ - ▸ Arg parsing — --pipeline template... ✓ - ▸ Arg parsing — --max-parallel... ✓ - ▸ Arg parsing — --branch-prefix... ✓ - ▸ Dry run shows what would happen... ✓ - ▸ Dry run does not create state file... ✓ - ▸ Fix status — no sessions... ✓ - ▸ Fix status shows existing sessions... ✓ - ▸ Invalid repo directory shows error... ✓ - ▸ Fix start creates state file... ✓ - ▸ Fix start emits events... ✓ - ▸ Fix start — summary output... ✓ - ▸ Branch name sanitization... ✓ - ▸ Fix header shows configuration... ✓ - ▸ Non-git repo warning... ✓ - ▸ Per-repo event tracking... ✓ +Core functions + ▸ Sources correctly and exports expected functions... ✓ + ▸ get_rerun_marker returns default when no policy... ✓ + ▸ get_rerun_marker reads from policy.json... 
✓ -━━━ Results ━━━ - Passed: 22 - Failed: 0 - Total: 22 +SHA deduplication + ▸ rerun_already_requested returns true when marker+sha present... ✓ + ▸ rerun_already_requested returns false when no marker... ✓ + ▸ Different SHAs get different dedup results... ✓ + ▸ request_rerun skips when same SHA already requested... ▸ Rerun already requested for PR #1 at SHA abc1234 — skipping +✓ + ▸ request_rerun posts when SHA not yet requested... ✓ Rerun requested for PR #1 at SHA deadbee +✓ -All 22 tests passed! +Error handling + ▸ request_rerun handles missing PR number... ERROR: scripts/sw-review-rerun-test.sh:200 exited with status 1 +✓ + ▸ request_rerun handles missing SHA... ERROR: scripts/sw-review-rerun-test.sh:206 exited with status 1 +✓ + ▸ request_rerun handles gh failure gracefully... ERROR: scripts/sw-review-rerun-test.sh:241 exited with status 1 +✓ + ▸ check_rerun_state handles missing issue number... ERROR: scripts/sw-review-rerun-test.sh:247 exited with status 1 +✓ +Main entry + ▸ main help shows subcommands... ✓ + ▸ main unknown subcommand errors... ✓ -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright memory+cost test — Unit Tests for Memory & Cost ║ -╚═══════════════════════════════════════════════════════════════════╝ +════════════════════════════════════════════════════════ + All 14 tests passed ✓ +════════════════════════════════════════════════════════ -Setting up mock environment... -✓ Environment ready: /tmp/sw-memory-test.yiF1Rw - ▸ Memory capture from pipeline state... ✓ - ▸ Memory inject returns context for each stage... ✓ - ▸ Failure capture stores patterns... ✓ - ▸ Pattern detection identifies project type... ✓ - ▸ Cross-repo vs per-repo isolation... ✓ - ▸ Memory show displays dashboard... ✓ - ▸ Memory search finds matching entries... ✓ - ▸ Memory export produces valid JSON... ✓ - ▸ Memory forget clears repo memory... ✓ - ▸ Cost calculation for each model... ✓ - ▸ Cost recording writes to costs.json... 
✓ - ▸ Budget set and check... ✓ - ▸ Cost dashboard runs without errors... ✓ - ▸ Cost JSON output is valid... ✓ - ▸ Actionable failures threshold filtering... ✓ - ▸ Actionable failures with no file returns []... ✓ - ▸ DORA baseline calculation from events... ✓ +╔═══════════════════════════════════════════════════════════╗ +║ shipwright tracker providers — Test Suite (26 tests) ║ +╚═══════════════════════════════════════════════════════════╝ + +Setting up test environment... + +GitHub Provider + ▸ GitHub sources correctly and exports expected functions... ✓ + ▸ provider_discover_issues calls gh issue list with correct args... ✓ + ▸ provider_get_issue calls gh issue view... ✓ + ▸ provider_create_issue calls gh issue create... ✓ + ▸ provider_discover_issues parses gh JSON output... ✓ + ▸ provider_get_issue parses gh JSON output... ✓ + ▸ provider_create_issue parses gh response... ✓ + ▸ NO_GITHUB=1 returns empty for discover... ✓ + ▸ provider_comment calls gh issue comment... ✓ + ▸ provider_close_issue calls gh issue close... ✓ + ▸ provider_get_issue_body calls gh issue view... ✓ + +Linear Provider + ▸ Linear sources correctly and exports expected functions... ✓ + ▸ provider_discover_issues uses curl to Linear GraphQL API... ✓ + ▸ provider_discover_issues parses Linear API response... ✓ + ▸ provider_get_issue uses curl... ✓ + ▸ provider_get_issue parses response... ✓ + ▸ Linear query construction includes team and issues... ✓ + +Jira Provider + ▸ Jira sources correctly and exports expected functions... ✓ + ▸ provider_discover_issues uses curl to Jira REST API... ✓ + ▸ Jira URL construction uses JIRA_BASE_URL and rest/api/3... ✓ + ▸ provider_discover_issues parses Jira response... ✓ + ▸ provider_get_issue uses issue/KEY endpoint... ✓ + ▸ provider_get_issue parses Jira response... ✓ + ▸ provider_create_issue uses POST to REST API... ✓ + ▸ provider_create_issue parses response... ✓ + ▸ provider_comment uses Jira comment API... 
✓ + +════════════════════════════════════════════════════════ + All 26 tests passed ✓ +════════════════════════════════════════════════════════ + -━━━ Results ━━━ - Passed: 17 - Failed: 0 - Total: 17 +Shipwright Budget & Chaos Test Suite -All 17 tests passed! +1. Budget Configuration + ✓ Budget config parses correctly + ✓ Cost tracking reads total spent correctly (18.5) + ✓ Budget alert: 92.5% of daily limit used + ✓ Budget exceeded detection works +2. Cost CLI + ✓ sw cost CLI available -╔═══════════════════════════════════════════════════╗ -║ shipwright session — E2E Test Suite ║ -╚═══════════════════════════════════════════════════╝ +3. Chaos: Missing Files + ✓ Pipeline handles missing daemon state +▸ Sprint Summary for 2026-03-02 to 2026-03-09 -Setting up test environment... -Test tmux session: sw-test-1303977 +{ + "pipelines": 0, + "succeeded": 0, + "failed": 0, + "retries": 0, + "avg_duration": 0, + "avg_stages": 0, + "slowest_stage": "unknown", + "quality_score": 0 +} + ✓ Retro handles missing events file gracefully -Template Loading - ▸ Load feature-dev template (3 agents)... ✓ - ▸ Load exploration template (2 agents)... ✓ - ▸ Missing template returns error... ✓ - ▸ All 24 templates load successfully... ✓ +shipwright memory v3.2.4 — Persistent Learning & Context System -Window Creation - ▸ No-launch creates window without claude... ✓ - ▸ Duplicate window detection... ✓ - ▸ Window gets dark theme... ✓ - ▸ Auto-generated team name... ✓ +USAGE + shipwright memory [options] -Prompt & Launcher - ▸ Launcher script generation with template... ✓ - ▸ Output includes agents from template... ✓ - ▸ Output includes goal when provided... ✓ - ▸ No template + goal works... ✓ - ▸ No template + no goal = interactive... 
✓ +COMMANDS + show Display memory for current repo + show --global Display cross-repo learnings + search Search memory for keyword + search --semantic Semantic search via memory_embeddings + forget --all Clear memory for current repo + export Export memory as JSON + import Import memory from JSON + stats Show memory size, age, hit rate -CLI & Configuration - ▸ Help flag... ✓ - ▸ Unknown option... ✓ - ▸ TMPDIR used for launcher files... ✓ +PIPELINE INTEGRATION + capture Capture pipeline learnings + inject Inject context for a stage + pattern [data] Record a codebase pattern + metric Update a performance baseline + decision Record a design decision + analyze-failure Analyze failure root cause via AI + fix-outcome Record fix effectiveness + ab-report Compare control vs treatment in A/B tests -Enhanced Features - ▸ Template auto-suggestion from goal... ✓ - ▸ No suggestion without goal... ✓ - ▸ Memory injection in prompt... ✓ - ▸ CLAUDE.md reminder in prompt... ✓ - ▸ Secure temp dir in source... 
✓ +EXAMPLES + shipwright memory show # View repo memory + shipwright memory show --global # View cross-repo learnings + shipwright memory search "auth" # Find auth-related memories + shipwright memory export > backup.json # Export memory + shipwright memory import backup.json # Import memory + shipwright memory capture .claude/pipeline-state.md .claude/pipeline-artifacts + shipwright memory inject build # Get context for build stage + ✓ Memory handles missing memory dir (non-zero exit ok) -════════════════════════════════════════════════════ - All 21 tests passed ✓ -════════════════════════════════════════════════════ +shipwright cost v3.2.4 — Token Usage & Cost Intelligence +USAGE + shipwright cost [options] -╔═══════════════════════════════════════════════════╗ -║ shipwright init — E2E Test Suite ║ -╚═══════════════════════════════════════════════════╝ +COMMANDS + show Show cost summary for current period + show --period 30 Last 30 days + show --json JSON output + show --by-stage Breakdown by pipeline stage + show --by-issue Breakdown by issue + budget set Set daily budget (USD) + budget show Show current budget/usage -Setting up sandboxed environment... -Temp dir: /tmp/sw-init-test.JhXYsO +PIPELINE INTEGRATION + record [issue] Record token usage + record-outcome Record pipeline outcome + calculate Calculate cost (no record) + check-budget [estimated_usd] Check budget before starting -Configuration - ▸ Settings.json created with agent teams... ✓ - ▸ Settings merge preserves existing vars... ✓ - ▸ tmux.conf installed... ✓ - ▸ Overlay installed... ✓ +EFFICIENCY + efficiency Show cost/success efficiency metrics + efficiency --json JSON output -Templates - ▸ Team templates installed (>= 10)... ✓ - ▸ Pipeline templates installed (>= 5)... ✓ - ▸ Legacy templates path populated... 
✓ +MODEL PRICING + update-pricing [model] [in] [out] Update model pricing + update-pricing Show current pricing + Current: opus $15.00/$75.00, sonnet $3.00/$15.00, haiku $0.25/$1.25 -Robustness - ▸ Idempotency — double init safe... ✓ - ▸ Doctor runs at end... ✓ - ▸ Help flag... ✓ +EXAMPLES + shipwright cost show # 7-day cost summary + shipwright cost show --period 30 --by-stage # 30-day breakdown by stage + shipwright cost budget set 50.00 # Set $50/day limit + shipwright cost budget show # Check current budget + shipwright cost efficiency # Cost per successful pipeline + shipwright cost update-pricing opus 15.00 75.00 # Update opus pricing + shipwright cost calculate 50000 10000 opus # Estimate cost + ✓ Cost handles missing costs file (non-zero exit ok) -Hook Wiring - ▸ JSONC stripped from settings.json... ✓ - ▸ Hooks wired into settings.json... ✓ - ▸ Hook wiring preserves existing hooks... ✓ - ▸ SessionStart hook installed... ✓ - ▸ Hook wiring with pre-existing settings... ✓ +4. Chaos: Corrupted Files -════════════════════════════════════════════════════ - All 15 tests passed ✓ -════════════════════════════════════════════════════ + Shipwright — Status Dashboard + 2026-03-09 07:36:48 + ══════════════════════════════════════════ + TMUX WINDOWS + ────────────────────────────────────────── + No Claude team windows found. + Start one with: shipwright session  -╔═══════════════════════════════════════════════════╗ -║ shipwright tracker — Test Suite ║ -╚═══════════════════════════════════════════════════╝ + TEAM CONFIGS ~/.claude/teams/ + ────────────────────────────────────────── + No team configs found. -Setting up test environment... + TASK LISTS ~/.claude/tasks/ + ────────────────────────────────────────── + No task lists found. -Provider Loading - ▸ Provider loads from config (linear)... ✓ - ▸ Provider loads jira... ✓ - ▸ Graceful skip when no provider... 
✓ + DATABASE ~/.shipwright/shipwright.db + ────────────────────────────────────────── + ● SQLite v WAL=wal 0.3MB + events:0 runs:0 costs:0 -Pipeline Enrichment - ▸ Stage descriptions exist for all 12 stages... ✓ - ▸ Enriched progress body has Delivering line... ✓ - ▸ Enriched progress body has stage descriptions... ✓ - ▸ Pipeline state includes stage_progress... ✓ - ▸ Pipeline state includes stage description... ✓ + CONNECTED DEVELOPERS + ────────────────────────────────────────── + Dashboard not reachable (http://localhost:8767) -Integration - ▸ Tracker notify routes to provider (mock)... ✓ - ▸ Dashboard reads goal from pipeline state... ✓ - ▸ Jira config validation... ✓ - ▸ Linear config migration (legacy fallback)... ✓ + ────────────────────────────────────────── + No active teams. Start one: shipwright session  -════════════════════════════════════════════════════ - All 12 tests passed ✓ -════════════════════════════════════════════════════ + ✓ Status handles corrupted daemon state +shipwright activity — Live agent activity stream +USAGE + shipwright activity [subcommand] [options] -╔═══════════════════════════════════════════════════╗ -║ shipwright heartbeat + checkpoint — Test Suite ║ -╚═══════════════════════════════════════════════════╝ +SUBCOMMANDS + watch Live stream of agent activity (default) + snapshot Current state of all active agents + history [range] Replay past activity (e.g., '1h', '10m', 'all') + stats Running counters (events, commits, tests, agents) + agents List known agents and last activity + help Show this help message -Setting up test environment... +OPTIONS + --type Filter events by type (e.g., 'stage.completed') + --agent Filter by agent name + --team Filter by team + --stage Filter by pipeline stage (e.g., 'build') -Heartbeat Lifecycle - ▸ Write heartbeat creates JSON file... ✓ - ▸ Check heartbeat reports alive... ✓ Job test-job-alive alive (0s ago) -✓ - ▸ Check heartbeat reports stale... 
⚠ Job test-job-stale stale (192940640s ago, timeout: 120s) -✓ - ▸ Clear heartbeat removes file... ✓ Cleared heartbeat for job: test-job-clear -✓ - ▸ List heartbeats returns JSON array... ✓ - ▸ Heartbeat update overwrites existing... ✓ - ▸ Check missing heartbeat returns error... ✓ - ▸ Heartbeat dir auto-created when missing... ✓ +EXAMPLES + \033[2mshipwright activity\033[0m # Live stream + \033[2mshipwright activity watch --type stage.*\033[0m # Only stage events + \033[2mshipwright activity history 1h\033[0m # Last hour + \033[2mshipwright activity snapshot\033[0m # Current state + \033[2mshipwright activity stats\033[0m # Counters + ✓ Activity handles corrupted events file -Checkpoint Lifecycle - ▸ Checkpoint save creates JSON file... ✓ Checkpoint saved for stage build (iteration 5) -✓ - ▸ Checkpoint restore outputs JSON... ✓ Checkpoint saved for stage test (iteration 3) -✓ - ▸ Checkpoint restore missing stage fails... ✓ - ▸ Checkpoint clear removes file... ✓ Checkpoint saved for stage review (iteration 1) -✓ Cleared checkpoint for stage review -✓ - ▸ Checkpoint clear --all removes all... ✓ Checkpoint saved for stage build (iteration 1) -✓ Checkpoint saved for stage test (iteration 2) -✓ Cleared 2 checkpoint(s) -✓ - ▸ Checkpoint save with files-modified... ✓ Checkpoint saved for stage build (iteration 7) -✓ +shipwright cost v3.2.4 — Token Usage & Cost Intelligence -Integration - ▸ Pipeline script has heartbeat functions... ✓ - ▸ Loop script has heartbeat and checkpoint... ✓ - ▸ Pipeline has human intervention checks... 
✓ +USAGE + shipwright cost [options] -════════════════════════════════════════════════════ - All 17 tests passed ✓ -════════════════════════════════════════════════════ +COMMANDS + show Show cost summary for current period + show --period 30 Last 30 days + show --json JSON output + show --by-stage Breakdown by pipeline stage + show --by-issue Breakdown by issue + budget set Set daily budget (USD) + budget show Show current budget/usage +PIPELINE INTEGRATION + record [issue] Record token usage + record-outcome Record pipeline outcome + calculate Calculate cost (no record) + check-budget [estimated_usd] Check budget before starting -╔═══════════════════════════════════════════════════╗ -║ shipwright remote — Test Suite ║ -╚═══════════════════════════════════════════════════╝ +EFFICIENCY + efficiency Show cost/success efficiency metrics + efficiency --json JSON output -Setting up test environment... +MODEL PRICING + update-pricing [model] [in] [out] Update model pricing + update-pricing Show current pricing + Current: opus $15.00/$75.00, sonnet $3.00/$15.00, haiku $0.25/$1.25 -Machine Registry - ▸ Add machine creates machines.json... ▸ Checking shipwright installation at /tmp/sw-remote-test.Qm0tkC/mock-install... -✓ Shipwright found at /tmp/sw-remote-test.Qm0tkC/mock-install -✓ Registered machine: builder-1 (localhost, worker, 4 workers) -✓ - ▸ Add second machine appends to array... ▸ Checking shipwright installation at /tmp/sw-remote-test.Qm0tkC/mock-install... -✓ Shipwright found at /tmp/sw-remote-test.Qm0tkC/mock-install -✓ Registered machine: builder-2 (localhost, worker, 8 workers) -✓ - ▸ Remove machine by name... ✓ Removed machine: builder-2 -✓ - ▸ List machines returns output... ✓ - ▸ machines.json uses atomic writes... ✓ - ▸ Duplicate machine name prevented... ▸ Use shipwright remote remove builder-1 first -✓ - ▸ Remote script has help command... 
✓ +EXAMPLES + shipwright cost show # 7-day cost summary + shipwright cost show --period 30 --by-stage # 30-day breakdown by stage + shipwright cost budget set 50.00 # Set $50/day limit + shipwright cost budget show # Check current budget + shipwright cost efficiency # Cost per successful pipeline + shipwright cost update-pricing opus 15.00 75.00 # Update opus pricing + shipwright cost calculate 50000 10000 opus # Estimate cost + ✓ Cost handles corrupted costs file -CLI & Dashboard Integration - ▸ CLI router includes remote command... ✓ - ▸ CLI router includes heartbeat command... ✓ - ▸ CLI router includes checkpoint command... ✓ - ▸ Doctor has heartbeat health check... ✓ - ▸ Doctor has remote machine checks... ✓ - ▸ Status shows heartbeat section... ✓ - ▸ Status shows remote machines section... ✓ +shipwright memory v3.2.4 — Persistent Learning & Context System -════════════════════════════════════════════════════ - All 14 tests passed ✓ -════════════════════════════════════════════════════ +USAGE + shipwright memory [options] +COMMANDS + show Display memory for current repo + show --global Display cross-repo learnings + search Search memory for keyword + search --semantic Semantic search via memory_embeddings + forget --all Clear memory for current repo + export Export memory as JSON + import Import memory from JSON + stats Show memory size, age, hit rate -╔═══════════════════════════════════════════════════════════════════╗ -║ shipwright intelligence test — Unit Tests ║ -╚═══════════════════════════════════════════════════════════════════╝ +PIPELINE INTEGRATION + capture Capture pipeline learnings + inject Inject context for a stage + pattern [data] Record a codebase pattern + metric Update a performance baseline + decision Record a design decision + analyze-failure Analyze failure root cause via AI + fix-outcome Record fix effectiveness + ab-report Compare control vs treatment in A/B tests -Setting up test environment... 
-✓ Environment ready: /tmp/sw-intelligence-test.aRDezI - - ▸ analyze_issue returns valid schema... ✓ - ▸ Cache hit on second call with same input... ✓ - ▸ Graceful degradation when claude CLI unavailable... ✓ - ▸ compose_pipeline produces valid pipeline JSON... ✓ - ▸ recommend_model returns valid model names... ✓ - ▸ predict_cost returns numeric estimates... ✓ - ▸ Cache TTL expiry returns miss... ✓ - ▸ search_memory returns ranked results... ✓ - ▸ Feature flag disabled returns fallback... ✓ - ▸ Events emitted for analysis... ✓ - ▸ recommend_model emits events... ✓ - ▸ Cache init creates file if missing... ✓ +EXAMPLES + shipwright memory show # View repo memory + shipwright memory show --global # View cross-repo learnings + shipwright memory search "auth" # Find auth-related memories + shipwright memory export > backup.json # Export memory + shipwright memory import backup.json # Import memory + shipwright memory capture .claude/pipeline-state.md .claude/pipeline-artifacts + shipwright memory inject build # Get context for build stage + ✓ Memory handles corrupted memory file -━━━ Results ━━━ - Passed: 12 - Failed: 0 - Total: 12 +5. Chaos: Large Files + ✓ Large events file (1000 lines) exists + ✓ Large outcomes file (500 lines) exists -All 12 tests passed! +6. Chaos: Concurrent Access + ✓ Concurrent event writes (10/10 survived) +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +Results: 16 passed / 0 failed / 16 total +ALL TESTS PASSED -╔═══════════════════════════════════════════════════╗ -║ shipwright pipeline composer — Test Suite ║ -╚═══════════════════════════════════════════════════╝ + Chaos Test Suite: Fault Injection & Recovery + ══════════════════════════════════════════ -Setting up test environment... -Pipeline Composition - ▸ Composed pipeline has valid stage ordering... ✓ - ▸ High-risk issue gets security stages... ▸ Composing pipeline with intelligence engine... 
-✓ Composed pipeline: 7 stages -.claude/pipeline-artifacts/composed-pipeline.json -✓ - ▸ Fallback to static template when no intelligence... ✓ + Chaos Test 1: Kill mid-state-write recovery + ✓ Stale tmp file detected + ✓ Recovered from mid-write state file + ✓ State file valid after recovery -Conditional Stage Insertion - ▸ Stage inserted at correct position after build... ✓ - ▸ Insert into nonexistent stage fails... ✓ + Chaos Test 2: Disk full error handling + ✓ Disk full error handled gracefully -Model Downgrade - ▸ Budget constraint triggers model downgrades... ✓ - ▸ Downgrade with nonexistent stage fails... ✓ + Chaos Test 3: Corrupted state recovery + ✓ State file corrupted as expected + ✓ State recreated from schema after corruption -Pipeline Validation - ▸ Validation accepts valid pipeline... ✓ - ▸ Validation rejects invalid ordering (test before build)... ✓ - ▸ Validation rejects missing stage ids... ✓ - ▸ Validation rejects missing stages array... ✓ + Chaos Test 4: Concurrent daemon prevention +▸ Queued issue 999 (at capacity) + ✓ Issue enqueued + ✓ First daemon gets issue + ✓ Second daemon gets nothing (already dequeued) + ✓ Concurrent daemon duplicate prevention works -Iteration Estimation - ▸ Iteration estimates are reasonable (1-50 range)... ✓ + Chaos Test 5: GitHub API timeout recovery + ✓ gh timeout detected + ✓ Timeout triggers backoff behavior -════════════════════════════════════════════════════ - All 12 tests passed ✓ -════════════════════════════════════════════════════ + Chaos Test 6: Corrupted event log resilience + ✓ Valid events parsed despite corruption + ✓ Event parser robust to corruption + Chaos Test 7: State file permission race + ✓ State file permission issue detected + ✓ State recovery from permission error -━━━ shipwright self-optimize tests ━━━ + Chaos Test 8: Large file operation interruption + ✓ Incomplete clone detected + ✓ Repo cleanup after partial clone - ▸ Outcome analysis extracts correct metrics... 
✓ - ▸ Outcome analysis emits event... ✓ - ▸ Outcome analysis rejects missing file... ✓ - ▸ Template weight increases for high success... ✓ - ▸ Template weight decreases for low success... ✓ - ▸ A/B test selects ~20% sample... ✓ - ▸ Iteration model updates with data points... ✓ - ▸ Model routing tracks success rates... ✓ - ▸ Model routing keeps opus with few sonnet samples... ✓ - ▸ Memory pruning removes old patterns... ✓ - ▸ Memory strengthening boosts confirmed patterns... ✓ - ▸ Memory promotion copies cross-repo patterns... ✓ - ▸ Full analysis runs on empty data... ✓ - ▸ Report generates output with data... ✓ - ▸ Report handles empty outcomes... ✓ - ▸ Outcome analysis extracts stage data... ✓ + Chaos Test 9: Cascade failure recovery + ✓ State recovery succeeds despite cascade + ✓ Cascade failure recovery in place -━━━ Results ━━━ - Passed: 16 - Failed: 0 - Total: 16 + Chaos Test 10: Poison pill in active_jobs + ✓ Malformed job skipped, valid job processed + ✓ Poison pill handling robust -All 16 tests passed! + Chaos Test 11: Memory exhaustion detection + ✓ Large state loads successfully + ✓ Memory pressure handling in place + Chaos Test 12: Filesystem inconsistency recovery + ✓ Filesystem inconsistency recovery works -╔═══════════════════════════════════════════════════════════╗ -║ shipwright predictive test ║ -╚═══════════════════════════════════════════════════════════╝ + ────────────────────────────────────────── + All 25 tests passed -Risk Assessment - ▸ Risk returns valid schema with 0-100 range... ✓ - ▸ Risk elevates for complex keywords... ✓ - ▸ Risk emits event... ✓ -Anomaly Detection - ▸ Critical at 3x threshold (181 vs 60 baseline)... ✓ - ▸ Normal at 2x (119 vs 60 baseline)... ✓ - ▸ Warning between 2x and 3x (150 vs 60)... ✓ - ▸ Normal when no baseline exists... ✓ - ▸ Emits event for critical anomaly... ✓ +Shipwright Autonomous Loop E2E Test -Baseline Management - ▸ First data point uses raw value... ✓ - ▸ EMA calculation (0.9*100 + 0.1*200 = 110)... 
✓ - ▸ Creates baseline file if missing... ✓ +1. Autonomous CLI + ✓ sw autonomous help lists subcommands -Preventative Injection - ▸ Injects context from matching patterns... ✓ - ▸ Returns empty for non-matching stage... ✓ +2. Daemon Coordination + ✓ Autonomous status command works without daemon + ✓ Daemon state file created for coordination test -Graceful Degradation - ▸ Works without intelligence engine... ✓ +3. Strategic Ingestion + ✓ Strategic events parsed correctly (found 2) + ✓ Autonomous state file is valid JSON -AI Patrol - ▸ AI patrol returns structured findings... ✓ +4. Self-Optimize Integration + ✓ sw self-optimize help works + ✓ Outcomes file exists for optimization + ✓ Retro JSON available for optimization ingest +▸ Quality score low (0) — boosting templates with stronger quality gates +✓ Retro ingested from retro-2026-02-10-to-2026-02-16.json + ✓ optimize ingest-retro command succeeds -════════════════════════════════════════════════════ - All 15 tests passed ✓ -════════════════════════════════════════════════════ +5. Triage Intelligence + ✓ sw triage help works + ✓ Triage system operational +6. Memory System + ✓ sw memory help works + ✓ Global memory file is valid and non-empty -╔═══════════════════════════════════════════════════╗ -║ shipwright frontier — Test Suite ║ -╚═══════════════════════════════════════════════════╝ +7. Discovery System + ✓ sw discovery help works + ✓ Discovery broadcast creates entries (1) -Setting up test environment... +8. Feedback System + ✓ sw feedback help lists subcommands -Adversarial Review - ▸ Adversarial review produces structured findings... ✓ - ▸ Adversarial iteration converges on no critical findings... ✓ +9. Oversight System + ✓ sw oversight help works -Developer Simulation - ▸ Simulation generates objections from 3 personas... ✓ - ▸ Simulation address returns action items... ✓ +10. 
Pipeline Integration + ✓ sw pipeline help works + ✓ Pipeline stages has oversight gate in merge stage (pending implementation) + ✓ Pipeline monitor stage has proactive feedback collection (pending implementation) -Architecture Enforcer - ▸ Architecture model has valid schema... ✓ - ▸ Architecture validates changes (violation detected)... ✓ +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +Results: 20 passed / 0 failed / 20 total +ALL TESTS PASSED -Graceful Degradation - ▸ All three degrade gracefully when claude unavailable... ✓ +Shipwright Memory & Discovery E2E Test -════════════════════════════════════════════════════ - All 7 tests passed ✓ -════════════════════════════════════════════════════ +1. Memory Lifecycle + ✓ Memory directory exists + ✓ Failure patterns stored (2 patterns) + ✓ Fix effectiveness rate tracked (1.0) + ✓ Memory injection provides known fixes + ✓ Patterns stored and deduplicated (2 unique) +2. Global Memory + ✓ Global learnings stored (3 lessons) + ✓ Global memory is searchable (1 matches for 'lint') -╔═══════════════════════════════════════════════════╗ -║ shipwright connect — Test Suite ║ -╚═══════════════════════════════════════════════════╝ +3. Discovery Broadcast & Inject + ✓ Discovery file has 4 entries + ✓ Discovery query by file pattern finds 2 matches + ✓ Discovery query by pipeline finds 2 entries + ✓ Discovery TTL filters expired entries (4 active) + ✓ Discovery injection produces readable context -Setting up test environment... +4. Cross-Pipeline Learning Flow + ✓ Cross-pipeline learning: Pipeline B finds A's discovery + ✓ Memory-Discovery chain: both systems have data -Identity Resolution - ▸ resolve_developer_id from DEVELOPER_ID env... ✓ - ▸ resolve_developer_id from git config... ✓ - ▸ resolve_developer_id fallback to USER... ✓ - ▸ resolve_machine_name from MACHINE_NAME env... ✓ - ▸ resolve_machine_name from hostname... ✓ +5. 
Optimization Feedback + ✓ Outcome tracking: 67% success rate + ✓ Template weights computable from outcomes -Dashboard URL Resolution - ▸ resolve_dashboard_url from --url flag... ✓ - ▸ resolve_dashboard_url from DASHBOARD_URL env... ✓ - ▸ resolve_dashboard_url from team-config.json... ✓ - ▸ resolve_dashboard_url falls back to default... ✓ +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +Results: 16 passed / 0 failed / 16 total +ALL TESTS PASSED -Start/Stop Lifecycle - ▸ cmd_start creates PID file... ▸ Starting connect to http://localhost:8767 -▸ Developer: test-developer @ test-machine -✓ Connect started (PID 1315235) - Logs: /tmp/sw-connect-test.UX0pCu/home/.shipwright/connect.log - Stop: shipwright connect stop -✓ - ▸ cmd_start rejects if already running... Stop it first: shipwright connect stop -✓ - ▸ cmd_stop removes PID file... ⚠ Process 99999 not running — cleaning up stale PID file -✓ - ▸ cmd_stop handles missing PID gracefully... ⚠ Connect is not running (no PID file) -✓ + shipwright policy e2e test + ══════════════════════════════════════════ -Status - ▸ cmd_status shows connected when PID alive... ✓ - ▸ cmd_status shows disconnected when no PID... ✓ + Policy File Validity + ✓ policy.json is valid JSON + ✓ policy has .daemon section + ✓ policy has .pipeline section + ✓ policy has .quality section + ✓ policy has .strategic section + ✓ policy has .sweep section + ✓ policy has .hygiene section + ✓ policy has .recruit section -Join Flow - ▸ cmd_join verifies token against dashboard... ✓ - ▸ cmd_join saves team-config.json... ✓ - ▸ cmd_join rejects invalid token... ✓ - ▸ cmd_join accepts --url and --token flags... ✓ + policy_get Function + ✓ policy_get reads pipeline.coverage_threshold_percent + ✓ policy_get reads daemon.poll_interval_seconds + ✓ policy_get reads hygiene.artifact_age_days + ✓ policy_get returns default for missing key + ✓ policy_get returns default from empty policy -Heartbeat & Disconnect Payloads - ▸ Heartbeat payload includes required fields... 
✓ - ▸ Send disconnect sends proper payload... ✓ + Pipeline Quality Thresholds from Policy + ✓ pipeline-quality reads coverage threshold from policy + ✓ pipeline-quality reads gate threshold from policy + ✓ pipeline_quality_min_threshold returns policy value -Configuration & Utilities - ▸ ensure_dir creates shipwright directory... ✓ - ▸ now_iso returns valid ISO timestamp... ✓ - ▸ Script has correct version... ✓ + Daemon Policy Integration + ✓ daemon poll_interval from policy + ✓ daemon heartbeat_timeout from policy + ✓ daemon stage_timeouts.build from policy + ✓ daemon auto_scale_interval from policy -Integration - ▸ Help command shows all main commands... ✓ + Real Policy Values Sanity + ✓ daemon.poll_interval_seconds >= 10 + ✓ pipeline.coverage_threshold >= 1 + ✓ pipeline.quality_gate_score >= 1 + ✓ pipeline.max_iterations_default >= 1 + ✓ strategic.max_issues_per_cycle >= 1 + + HOME-Based Policy Fallback + ✓ policy_get falls back to HOME policy.json + + ────────────────────────────────────────── + + All 26 tests passed + +╔═══════════════════════════════════════════════════════════════════╗ +║ shipwright e2e smoke test — Pipeline Orchestration (No API) ║ +╚═══════════════════════════════════════════════════════════════════╝ + +Setting up mock environment... +✓ Environment ready: /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-e2e-smoke.0BgEOX + + ▸ Dry-run exits zero... ✓ + ▸ State file created after dry-run... ✓ + ▸ State file has required fields... ✓ + ▸ Fast template loads correctly... ✓ + ▸ All templates parse as valid JSON... ✓ + ▸ Stage ordering preserved in output... ✓ + ▸ CI mode sets flags... ✓ + ▸ Completed stages recognized... ✓ + ▸ No feature branches after dry-run... ✓ + ▸ Artifact directory created... ✓ + ▸ Pipeline help text... ✓ + ▸ Version consistency (pipeline vs daemon)... ✓ + ▸ Goal flag accepted... ✓ + ▸ Invalid template errors correctly... ✓ + ▸ Issue number in state... ✓ + ▸ Headless auto-detection (non-interactive stdin)... 
✓ + ▸ Headless flag sets skip-gates... ✓ + ▸ Autonomous template all-auto gates... ✓ + ▸ Pipeline exit code default is 1 (failure)... ✓ + +━━━ Results ━━━ + Passed: 19 + Failed: 0 + Total: 19 + +All 19 tests passed! + + +Dashboard E2E Tests + +Setting up mock data... +Mock data written to /var/folders/57/0gs_mdl104q8vk054nz3wp3h0000gn/T//sw-e2e-mock.g4oGJj +Starting dashboard server... +Server running on http://localhost:18767 (PID: 16779) + +Page Loading + ▸ index.html serves Fleet Command page... ✓ + ▸ JavaScript bundle loads (dist/main.js)... ✓ + +API Endpoints + ▸ GET /api/health returns ok... ✓ + ▸ GET /api/status returns full FleetState shape... ✓ + ▸ FleetState has active pipelines (issue 142)... ✓ + ▸ FleetState daemon shows running... ✓ + ▸ FleetState has agent heartbeats... ✓ + ▸ FleetState has cost data (today_spent, daily_budget)... ✓ + ▸ FleetState has queued items... ✓ + ▸ GET /api/me returns user info... ✓ + ▸ GET /api/metrics/history returns metrics + dora_grades... ✓ + ▸ GET /api/timeline returns array... ✓ + ▸ GET /api/activity returns events... ✓ + ▸ GET /api/machines returns registered machines... ✓ + ▸ GET /api/alerts returns alerts array... ✓ + ▸ GET /api/daemon/config returns JSON... ✓ + ▸ GET /api/metrics/failure-heatmap returns heatmap... ✓ + ▸ GET /api/metrics/bottlenecks returns bottlenecks... ✓ + ▸ GET /api/metrics/stage-performance returns stages... ✓ + ▸ GET /api/predictions/142 returns ETA... ✓ + ▸ GET /api/logs/142 returns log content... ✓ + +New Endpoints (Phase 3-5) + ▸ GET /api/linear/status returns JSON... ✓ + ▸ GET /api/queue/detailed returns items... ✓ + ▸ GET /api/memory/global returns learnings... ✓ + ▸ GET /api/notifications/config returns config... ✓ + ▸ GET /api/approval-gates returns config... ✓ + ▸ GET /api/quality-gates returns rules... ✓ + ▸ GET /api/pipeline/142/diff returns diff... ✓ + ▸ GET /api/pipeline/142/files returns files... ✓ + ▸ GET /api/pipeline/142/reasoning returns reasoning... 
✓ + ▸ GET /api/pipeline/142/failures returns failures... ✓ + ▸ GET /api/pipeline/142/quality returns quality... ✓ + ▸ GET /api/audit-log returns entries... ✓ + ▸ GET /api/rbac returns config... ✓ + ▸ GET /api/db/health returns status... ✓ + ▸ GET /api/db/events returns status... ✓ + +WebSocket + ▸ WebSocket connects and receives FleetState... ✓ ════════════════════════════════════════════════════ - All 25 tests passed ✓ + All 37 tests passed ✓ ════════════════════════════════════════════════════ diff --git a/.claude/platform-hygiene.json b/.claude/platform-hygiene.json index c480b9215..963f00852 100644 --- a/.claude/platform-hygiene.json +++ b/.claude/platform-hygiene.json @@ -1,5 +1,5 @@ { - "timestamp": "2026-03-09T10:59:38Z", + "timestamp": "2026-03-09T11:42:26Z", "repository": "pipeline-issue-122", "counts": { "hardcoded": 44, @@ -159,7 +159,7 @@ }, { "file": "scripts/sw-evidence.sh", - "line": 996 + "line": 1000 }, { "file": "scripts/sw-pipeline-composer.sh", @@ -617,7 +617,7 @@ }, { "script": "sw-pipeline-test.sh", - "lines": 1959 + "lines": 1963 }, { "script": "sw-db.sh", From 4a3594e9a8a9d978519864237da2ef2eac530bc8 Mon Sep 17 00:00:00 2001 From: Seth Ford Date: Mon, 9 Mar 2026 08:34:26 -0400 Subject: [PATCH 8/8] =?UTF-8?q?chore:=20pipeline=20cleanup=20=E2=80=94=20c?= =?UTF-8?q?ommit=20remaining=20build=20changes?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .claude/daemon-config.json | 2 +- .claude/intelligence-cache.json | 27 ++++++++++++++++++++++++++- .claude/loop-state.md | 17 +++++++++++------ .claude/platform-hygiene.json | 2 +- 4 files changed, 39 insertions(+), 9 deletions(-) diff --git a/.claude/daemon-config.json b/.claude/daemon-config.json index 6acee92ae..81c337d9f 100644 --- a/.claude/daemon-config.json +++ b/.claude/daemon-config.json @@ -16,7 +16,7 @@ "pipeline_template": "autonomous", "auto_template": false, "last_optimization": { - "timestamp": "2026-03-09T11:50:49Z", + "timestamp": 
"2026-03-09T12:06:32Z",
 "adjustments": "compound_quality enabled (CFR 25% > 20%); merge stage recommended (deploy freq 3.0/week); "
 }
 }
diff --git a/.claude/intelligence-cache.json b/.claude/intelligence-cache.json
index 70259e98b..57511f7b8 100644
--- a/.claude/intelligence-cache.json
+++ b/.claude/intelligence-cache.json
@@ -263,7 +263,7 @@
 }
 }
 ],
- "timestamp": 1773056933,
+ "timestamp": 1773059374,
 "ttl": 1800
 },
 "fdbb7bc340414cf938cecd4b790fa9e2": {
@@ -647,6 +647,31 @@
 },
 "timestamp": 1772936833,
 "ttl": 3600
+ },
+ "293f94c5d80cb1596386597316f534d3": {
+ "result": {
+ "complexity": 5,
+ "risk_level": "medium",
+ "success_probability": 78,
+ "recommended_template": "fast",
+ "issue_type": "testing",
+ "key_risks": [
+ "False positives on legitimate repeated work or transient failures",
+ "State hashing overhead in tight iteration loop",
+ "Integration with existing loop restart and error injection logic",
+ "Edge cases in file change detection and git status parsing"
+ ],
+ "implementation_hints": [
+ "Hash (git status + changed files + last 3 lines of test output) per iteration using sha256sum",
+ "Track hashes in memory array with iteration metadata (number, timestamp, hash)",
+ "Implement early return before test execution when stall detected to save cost",
+ "Add --no-stall-detection flag and environment variable override",
+ "Test with mock iterations that produce identical hashes, different hashes, and mixed patterns",
+ "Write detailed stall reason to error-log.jsonl including hash sequence and iteration range"
+ ]
+ },
+ "timestamp": 1773058311,
+ "ttl": 3600
 }
 }
 }
diff --git a/.claude/loop-state.md b/.claude/loop-state.md
index ab354f1b6..44e8bcac0 100644
--- a/.claude/loop-state.md
+++ b/.claude/loop-state.md
@@ -52,17 +52,22 @@ Discoveries from other pipelines:
 [intake] Stage intake completed — Resolution:
 [design] Design completed for Build a production-grade todo application. TypeScript + React frontend with Vite, Express REST API backend, SQLite persistence with Drizzle ORM, JWT authentication (register/login), full CRUD for todos with filtering (all/active/completed), drag-and-drop reorder, due dates, priorities (low/medium/high), dark mode, responsive design. Include comprehensive test suite (unit + integration + e2e). Production-ready: error handling, input validation, rate limiting, CORS, environment config. — Resolution:
 [intake] Stage intake completed — Resolution:
-[intake] Stage intake completed — Resolution: "
-iteration: 1
+[intake] Stage intake completed — Resolution:
+
+## Failure Diagnosis (Iteration 2)
+Classification: unknown
+Strategy: retry_with_context
+Repeat count: 0"
+iteration: 2
 max_iterations: 10
-status: running
+status: complete
 test_cmd: "npm test"
 model: opus
 agents: 1
-started_at: 2026-03-09T11:11:31Z
-last_iteration_at: 2026-03-09T11:11:31Z
+started_at: 2026-03-09T11:53:15Z
+last_iteration_at: 2026-03-09T11:53:15Z
 consecutive_failures: 0
-total_commits: 1
+total_commits: 2
 audit_enabled: false
 audit_agent_enabled: false
 quality_gates_enabled: false
diff --git a/.claude/platform-hygiene.json b/.claude/platform-hygiene.json
index 963f00852..21a11a766 100644
--- a/.claude/platform-hygiene.json
+++ b/.claude/platform-hygiene.json
@@ -1,5 +1,5 @@
 {
- "timestamp": "2026-03-09T11:42:26Z",
+ "timestamp": "2026-03-09T12:19:28Z",
 "repository": "pipeline-issue-122",
 "counts": {
 "hardcoded": 44,