Releases: sethdford/shipwright

Shipwright v3.3.0

04 Apr 23:23

Highlights

Fully Autonomous Pipeline

  • Context exhaustion auto-recovery — detects Claude context limits, auto-restarts with progress briefing. Zero human intervention.
  • Config-driven model routing via _smart_model(), which replaces all hardcoded model names and reads from env → daemon-config → user-config → defaults.
  • Adaptive effort levels — per-stage effort from daemon-config.json with intelligent defaults.
  • Exponential backoff — replaces hardcoded sleep values in health checks and deployment polling.
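
The exponential backoff mentioned above can be illustrated with a minimal Python sketch. This is not Shipwright's code (the project itself is shell-based); the function names, the base delay, the growth factor, and the cap are all assumptions chosen for illustration.

```python
import time
from typing import Callable, Iterator


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0, attempts: int = 6) -> Iterator[float]:
    """Yield base, base*factor, base*factor^2, ... with each delay clamped to cap."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def poll_until(check: Callable[[], bool],
               sleep: Callable[[float], None] = time.sleep,
               **backoff_kwargs: float) -> bool:
    """Call check() until it succeeds, sleeping a growing delay after each failure.

    Hypothetical helper: a health check or deployment poll would be passed
    in as `check`. One final check is made after the last sleep.
    """
    for delay in backoff_delays(**backoff_kwargs):
        if check():
            return True
        sleep(delay)
    return check()
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a fixed `sleep 5` loop would correspond to `factor=1.0`.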

Dark Factory Phases 1-8

  • Test-as-holdout validation, spec-driven development, causal graphs, auto-recovery
  • Process reward models, constitutional AI, formal specs, mutation testing
  • Cross-session RL with Thompson sampling bandits and policy learning
  • Spec-driven pipeline stages (spec_generation + spec_verification)

AutoResearch RL System

  • Reward aggregation from 80+ pipeline signals
  • Thompson Sampling for model/template selection
  • Policy learner with per-context strategy optimization
  • 26/26 E2E tests prove wiring
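
As a rough illustration of Thompson Sampling for template or model selection, the sketch below draws one sample per option from a Beta distribution built from its success and failure counts and picks the highest draw. It is a generic textbook version, not the project's implementation; the function name, the uniform Beta(1, 1) prior, and the template names in the usage lines are assumptions.

```python
import random
from typing import Dict, Tuple


def thompson_pick(stats: Dict[str, Tuple[int, int]], rng: random.Random) -> str:
    """Pick one arm given stats mapping arm -> (successes, failures).

    Each arm gets a draw from Beta(successes + 1, failures + 1), i.e. a
    uniform prior updated with the observed counts. The largest draw wins,
    so arms with little data still get explored occasionally.
    """
    if not stats:
        raise ValueError("no arms to choose from")
    draws = {
        arm: rng.betavariate(successes + 1, failures + 1)
        for arm, (successes, failures) in stats.items()
    }
    return max(draws, key=draws.get)


# Usage with hypothetical template names and counts.
history = {"fast": (12, 8), "standard": (30, 5), "full": (2, 1)}
choice = thompson_pick(history, random.Random())
```

After each pipeline run, the chosen arm's success or failure count would be incremented so later draws reflect the new evidence.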

Pipeline Fixes

  • Fixed 58 pipeline tests (were failing, now all pass)
  • Fixed scope enforcement set -e propagation bug
  • Fixed mock claude flag handling

SOTA Research

  • 5 research docs (77KB), 65+ sources
  • 20-item prioritized backlog with 12-week roadmap

Upgrade

shipwright upgrade --apply

Or reinstall:

curl -fsSL https://raw.githubusercontent.com/sethdford/shipwright/main/install.sh | bash

Full Changelog

https://github.com/sethdford/shipwright/blob/main/CHANGELOG.md

What's Changed

  • feat: AI-powered skill injection for pipeline stages by @sethdford in #194
  • refactor: split 3 large lib modules into focused sub-modules by @sethdford in #219
  • refactor: create lib/bootstrap.sh, convert 5 scripts by @sethdford in #220
  • refactor: establish modular backend structure for dashboard by @sethdford in #221
  • refactor: decompose sw-pipeline.sh into 4 focused modules by @sethdford in #222
  • refactor: decompose sw-recruit.sh into 3 focused modules by @sethdford in #223
  • fix: stabilize test suites and fix pipeline intelligence skip by @sethdford in #235
  • Add Claude Code GitHub Workflow by @sethdford in #243
  • feat: pipeline quality revolution — 7 components closing 6 quality gaps by @sethdford in #248

Full Changelog: v3.2.0...v3.3.0

Shipwright v3.2.0

27 Feb 11:29

Shipwright v3.1.0

22 Feb 13:49

v3.0.0 — Full Architecture Overhaul

19 Feb 12:59

Shipwright 3.0.0

A ground-up architecture overhaul making Shipwright database-first, event-driven, and self-learning.

Highlights

  • Centralized Configuration — All ~200+ magic numbers extracted into config/defaults.json with 4-layer precedence (env var > daemon-config > policy > defaults)
  • SQLite as Source of Truth — Daemon state, heartbeats, costs, pipeline runs, and memory all read/write to SQLite first with file fallback
  • Unified Event System — 3 separate event stores consolidated into a single SQLite events table with consumer offset tracking and durable checkpoints
  • Thompson Sampling — Template selection uses Beta distribution sampling over historical success rates per complexity tier
  • UCB1 Model Routing — Balances exploration/exploitation for model selection across pipeline stages
  • Semantic Memory — Keyword-relevance search over stored memories for context injection into agent prompts
  • Reasoning Traces — Multi-step autonomous reasoning stored and queryable
  • Adaptive Thresholds — Quality and anomaly thresholds computed from historical distributions instead of hardcoded values
  • Real-time Event Streaming — New /ws/events WebSocket endpoint for live event monitoring
  • Dead Code Cleanup — Removed duplicate helpers and color definitions from 90+ scripts
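
For readers unfamiliar with UCB1, here is a minimal sketch of the standard algorithm: try every option once, then pick the one with the highest mean reward plus an exploration bonus that shrinks as the option is used more. How Shipwright scores rewards per stage is not stated in these notes, so the reward scale (0 to 1), the function name, and the arm names are assumptions.

```python
import math
from typing import Dict, Tuple


def ucb1_pick(stats: Dict[str, Tuple[int, float]]) -> str:
    """Pick an arm given stats mapping arm -> (pulls, total_reward).

    Rewards are assumed to lie in [0, 1]. The score is the classic UCB1
    index: mean reward + sqrt(2 * ln(total pulls) / pulls of this arm).
    """
    if not stats:
        raise ValueError("no arms to choose from")
    # Every arm is tried once before any estimate is trusted.
    for arm, (pulls, _) in stats.items():
        if pulls == 0:
            return arm
    total_pulls = sum(pulls for pulls, _ in stats.values())

    def score(arm: str) -> float:
        pulls, total_reward = stats[arm]
        return total_reward / pulls + math.sqrt(2 * math.log(total_pulls) / pulls)

    return max(stats, key=score)


# Usage with hypothetical model tiers: the rarely used tier gets a large bonus.
pick = ucb1_pick({"small": (1000, 600.0), "large": (2, 1.0)})
```

Unlike Thompson Sampling, UCB1 is deterministic for a given set of counts, which can make routing decisions easier to reproduce when debugging.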

New Files

File                      Purpose
config/defaults.json      Central defaults for all tunables
config/event-schema.json  Known event types and field validation
scripts/lib/config.sh     Config reader with _config_get
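
The 4-layer precedence from the highlights (env var > daemon-config > policy > defaults) can be sketched as a simple first-match lookup. This is an illustration in Python, not the behavior of _config_get itself: the function name, the SW_ environment prefix, the upper-casing rule, and the flat key layout are all assumptions.

```python
from typing import Any, Mapping


def resolve_setting(key: str, *, env: Mapping[str, str],
                    daemon_config: Mapping[str, Any],
                    policy: Mapping[str, Any],
                    defaults: Mapping[str, Any],
                    env_prefix: str = "SW_") -> Any:
    """Return the value for key from the highest-priority layer that defines it.

    Hypothetical lookup order: environment variable, then daemon config,
    then policy, then defaults. Environment values come back as strings.
    """
    env_name = env_prefix + key.upper()  # assumed naming convention
    if env_name in env:
        return env[env_name]
    for layer in (daemon_config, policy, defaults):
        if key in layer:
            return layer[key]
    raise KeyError(key)


# Usage with made-up keys and values.
layers = dict(
    daemon_config={"poll_interval": 10},
    policy={"poll_interval": 20},
    defaults={"poll_interval": 30, "max_retries": 3},
)
from_env = resolve_setting("poll_interval", env={"SW_POLL_INTERVAL": "5"}, **layers)
from_daemon = resolve_setting("poll_interval", env={}, **layers)
from_defaults = resolve_setting("max_retries", env={}, **layers)
```

In real use, `env` would be `os.environ` and the three mappings would be loaded from their JSON files.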

Schema Changes

SQLite schema v6 with new tables: daemon_queue, event_consumers, durable_checkpoints, memory_patterns, memory_decisions, memory_embeddings, pipeline_outcomes, model_outcomes, reasoning_traces
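
To show what consumer offset tracking over a single events table can look like, here is a small self-contained sketch using Python's sqlite3 module. Only the table names events and event_consumers come from these notes; every column name, the helper functions, and the event types are hypothetical and do not describe the actual v6 schema.

```python
import sqlite3
from typing import List, Tuple


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with illustrative (assumed) columns."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE events (
            id      INTEGER PRIMARY KEY AUTOINCREMENT,
            type    TEXT NOT NULL,
            payload TEXT NOT NULL
        );
        CREATE TABLE event_consumers (
            consumer      TEXT PRIMARY KEY,
            last_event_id INTEGER NOT NULL DEFAULT 0
        );
    """)
    return conn


def emit(conn: sqlite3.Connection, event_type: str, payload: str) -> None:
    """Append one event; ids increase monotonically."""
    conn.execute("INSERT INTO events (type, payload) VALUES (?, ?)",
                 (event_type, payload))


def consume(conn: sqlite3.Connection, consumer: str,
            limit: int = 100) -> List[Tuple[int, str, str]]:
    """Return events this consumer has not seen yet and advance its offset."""
    row = conn.execute(
        "SELECT last_event_id FROM event_consumers WHERE consumer = ?",
        (consumer,)).fetchone()
    offset = row[0] if row else 0
    rows = conn.execute(
        "SELECT id, type, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (offset, limit)).fetchall()
    if rows:
        # Store the id of the last delivered event as the new offset.
        conn.execute(
            "INSERT OR REPLACE INTO event_consumers (consumer, last_event_id) "
            "VALUES (?, ?)", (consumer, rows[-1][0]))
    return rows
```

Because each consumer keeps its own offset, a dashboard and a metrics job (both hypothetical here) could read the same table independently without losing or re-reading events.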

Shipwright v2.4.0

17 Feb 11:59

Full Changelog: v2.3.1...v2.4.0

v2.3.1 — Autonomous Feedback Loops, Testing Foundation, Chaos Resilience

17 Feb 02:25

What's New

Testing Foundation (211 new tests)

  • Vitest unit tests — 113 tests across state store, API client, router, WebSocket, design tokens, and icons
  • Server API tests — 46 endpoint tests for error handling, edge cases, and lifecycle operations
  • Autonomous E2E — 20 tests for daemon coordination, strategic ingestion, retro-optimize, oversight gates
  • Budget & chaos — 16 tests for budget limits, missing/corrupted files, large files, concurrent writes
  • Memory & discovery — 16 tests for failure patterns, fix effectiveness, discovery TTL, cross-pipeline learning

Feedback Loops Wired (Tier 1)

  • Production → Issues — Monitor stage always collects deploy logs, not just on error threshold
  • Retro → Self-optimize — Retrospective metrics automatically feed into template weight adjustments
  • Oversight → Merge — Oversight gate + approval gate mandatory before merge stage

Coordination Gaps Closed (Tier 2)

  • Autonomous ↔ Daemon — Detects running daemon, delegates via ready-to-build label instead of duplicate pipelines
  • Strategic → Autonomous — Strategic agent findings ingested, deduplicated, and fed into autonomous creation loop
  • AI-driven triage — Intelligence engine classification with --ai flag, falls back to keyword-based

Trust & Validation (Tier 3)

  • Long-running autonomous E2E test validates 100-cycle drift scenarios
  • Budget guard tests prove system stops at limits
  • Chaos tests cover missing files, corrupted JSON, GitHub 500s, rate limits

Full Changelog: v2.3.0...v2.3.1

Shipwright v2.3.0

17 Feb 01:58

Full Changelog: v2.2.2...v2.3.0

Shipwright v2.2.2

16 Feb 21:18

Shipwright v2.2.1

16 Feb 20:19

Shipwright v2.2.0

16 Feb 20:07
