Releases · sethdford/shipwright

04 Apr 23:23

sethdford

v3.3.0

353cdc8

Shipwright v3.3.0 Latest

Highlights

Fully Autonomous Pipeline

Context exhaustion auto-recovery — detects Claude context limits, auto-restarts with progress briefing. Zero human intervention.
Config-driven model routing — _smart_model() replaces all hardcoded model names. Reads from env → daemon-config → user-config → defaults.
Adaptive effort levels — per-stage effort from daemon-config.json with intelligent defaults.
Exponential backoff — replaces hardcoded sleep values in health checks and deployment polling.
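
The backoff change can be pictured with a minimal sketch. This is not Shipwright's code (the project is shell-based); the function names, base delay, growth factor, and cap below are all assumptions chosen for illustration.

```python
import time
from typing import Callable, Iterator


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0, attempts: int = 6) -> Iterator[float]:
    """Yield delays that grow geometrically and never exceed `cap`."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def poll_until(check: Callable[[], bool],
               sleep: Callable[[float], None] = time.sleep,
               **backoff_kwargs) -> bool:
    """Call `check` until it returns True, sleeping longer after each miss.

    Returns False if every attempt fails. `sleep` is injectable so the
    loop can be exercised without real waiting.
    """
    for delay in backoff_delays(**backoff_kwargs):
        if check():
            return True
        sleep(delay)
    return False
```

A health check or deployment poll would pass its probe as `check`; the cap keeps a long outage from producing unbounded waits.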

Dark Factory Phases 1-8

Test-as-holdout validation, spec-driven development, causal graphs, auto-recovery
Process reward models, constitutional AI, formal specs, mutation testing
Cross-session RL with Thompson sampling bandits and policy learning
Spec-driven pipeline stages (spec_generation + spec_verification)

AutoResearch RL System

Reward aggregation from 80+ pipeline signals
Thompson Sampling for model/template selection
Policy learner with per-context strategy optimization
26/26 E2E tests pass, confirming the wiring
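
For readers unfamiliar with Thompson Sampling, here is a minimal Beta-Bernoulli sketch of the idea. It is an illustration only, not the project's implementation: the class name, the uniform Beta(1, 1) prior, and the binary success/failure reward are assumptions.

```python
import random


class ThompsonBandit:
    """Beta-Bernoulli Thompson Sampling over a fixed set of arms."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # Each arm starts at Beta(1, 1), i.e. a uniform prior.
        # params[arm] = [alpha, beta] = [successes + 1, failures + 1]
        self.params = {arm: [1.0, 1.0] for arm in arms}

    def select(self):
        """Sample a plausible success rate per arm and pick the highest."""
        samples = {
            arm: self.rng.betavariate(alpha, beta)
            for arm, (alpha, beta) in self.params.items()
        }
        return max(samples, key=samples.get)

    def update(self, arm, success):
        """Record one outcome for `arm` (True = success)."""
        self.params[arm][0 if success else 1] += 1.0


# Hypothetical arms; real template or model names would go here.
bandit = ThompsonBandit(["template-a", "template-b"], seed=0)
choice = bandit.select()
bandit.update(choice, success=True)
```

Arms with little history have wide posteriors and still get sampled occasionally, which is how the method explores without a separate exploration parameter.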

Pipeline Fixes

Fixed 58 previously failing pipeline tests; all now pass
Fixed a set -e propagation bug in scope enforcement
Fixed mock claude flag handling

SOTA Research

5 research docs (77 KB), 65+ sources
20-item prioritized backlog with 12-week roadmap

Upgrade

shipwright upgrade --apply

Or reinstall:

curl -fsSL https://raw.githubusercontent.com/sethdford/shipwright/main/install.sh | bash

Full Changelog

https://github.com/sethdford/shipwright/blob/main/CHANGELOG.md

What's Changed

feat: AI-powered skill injection for pipeline stages by @sethdford in #194
refactor: split 3 large lib modules into focused sub-modules by @sethdford in #219
refactor: create lib/bootstrap.sh, convert 5 scripts by @sethdford in #220
refactor: establish modular backend structure for dashboard by @sethdford in #221
refactor: decompose sw-pipeline.sh into 4 focused modules by @sethdford in #222
refactor: decompose sw-recruit.sh into 3 focused modules by @sethdford in #223
fix: stabilize test suites and fix pipeline intelligence skip by @sethdford in #235
Add Claude Code GitHub Workflow by @sethdford in #243
feat: pipeline quality revolution — 7 components closing 6 quality gaps by @sethdford in #248

Full Changelog: v3.2.0...v3.3.0

Contributors

sethdford

27 Feb 11:29

github-actions

v3.2.0

6c47a7e

Shipwright v3.2.0

Full Changelog: v3.1.0...v3.2.0

22 Feb 13:49

github-actions

v3.1.0

874e86a

Shipwright v3.1.0

Full Changelog: v3.0.0...v3.1.0

19 Feb 12:59

sethdford

v3.0.0

178bf51

v3.0.0 — Full Architecture Overhaul

Shipwright 3.0.0

A ground-up architecture overhaul making Shipwright database-first, event-driven, and self-learning.

Highlights

Centralized Configuration — All ~200+ magic numbers extracted into config/defaults.json with 4-layer precedence (env var > daemon-config > policy > defaults)
SQLite as Source of Truth — Daemon state, heartbeats, costs, pipeline runs, and memory all read/write to SQLite first with file fallback
Unified Event System — 3 separate event stores consolidated into a single SQLite events table with consumer offset tracking and durable checkpoints
Thompson Sampling — Template selection uses Beta distribution sampling over historical success rates per complexity tier
UCB1 Model Routing — Balances exploration/exploitation for model selection across pipeline stages
Semantic Memory — Keyword-relevance search over stored memories for context injection into agent prompts
Reasoning Traces — Multi-step autonomous reasoning stored and queryable
Adaptive Thresholds — Quality and anomaly thresholds computed from historical distributions instead of hardcoded values
Real-time Event Streaming — New /ws/events WebSocket endpoint for live event monitoring
Dead Code Cleanup — Removed duplicate helpers and color definitions from 90+ scripts
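
As an illustration of the UCB1 routing idea listed above, the following sketch implements the textbook UCB1 rule: try every arm once, then pick the arm with the highest mean reward plus an exploration bonus. The class name, the model labels, and the assumption that rewards lie in [0, 1] are hypothetical; the release notes do not say how Shipwright scores stage outcomes.

```python
import math


class UCB1Router:
    """Textbook UCB1 over a fixed set of candidate models."""

    def __init__(self, models):
        self.counts = {m: 0 for m in models}    # times each model was used
        self.totals = {m: 0.0 for m in models}  # summed rewards per model

    def select(self):
        # Play every model once before applying the UCB formula.
        for model, n in self.counts.items():
            if n == 0:
                return model
        total = sum(self.counts.values())

        def score(model):
            mean = self.totals[model] / self.counts[model]
            bonus = math.sqrt(2.0 * math.log(total) / self.counts[model])
            return mean + bonus

        return max(self.counts, key=score)

    def record(self, model, reward):
        """Store a reward in [0, 1] observed after using `model`."""
        self.counts[model] += 1
        self.totals[model] += reward


# Hypothetical model labels for illustration.
router = UCB1Router(["fast", "strong"])
model = router.select()
router.record(model, reward=1.0)
```

The bonus term shrinks as a model accumulates runs, so rarely used models keep getting re-tested while consistently good ones dominate.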

New Files

File	Purpose
config/defaults.json	Central defaults for all tunables
config/event-schema.json	Known event types and field validation
scripts/lib/config.sh	Config reader with _config_get
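
The 4-layer precedence (env var > daemon-config > policy > defaults) can be sketched as a first-match lookup. This is not the `_config_get` shell function from scripts/lib/config.sh; the Python function below, the `SW_` environment prefix, and the dotted key names are assumptions made for the example.

```python
import os


def config_get(key, daemon_config, policy, defaults, env=None, env_prefix="SW_"):
    """Return the value for `key` from the highest-precedence layer that has it.

    Order: environment variable, daemon-config, policy, defaults.
    The env var name is derived as PREFIX + KEY with dots replaced by
    underscores (a hypothetical convention). Env values stay strings.
    """
    env = os.environ if env is None else env
    env_key = env_prefix + key.upper().replace(".", "_")
    if env_key in env:
        return env[env_key]
    for layer in (daemon_config, policy, defaults):
        if key in layer:
            return layer[key]
    raise KeyError(key)


# Hypothetical tunables for illustration.
layers = {
    "daemon_config": {"poll.interval": 30},
    "policy": {"poll.interval": 60, "max.retries": 3},
    "defaults": {"poll.interval": 120, "max.retries": 5},
}
interval = config_get("poll.interval", env={}, **layers)
```

Passing `env` explicitly keeps the lookup testable; leaving it as `None` reads the real process environment.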

Schema Changes

SQLite schema v6 with new tables: daemon_queue, event_consumers, durable_checkpoints, memory_patterns, memory_decisions, memory_embeddings, pipeline_outcomes, model_outcomes, reasoning_traces
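
To make consumer offset tracking concrete, here is a small sketch using Python's built-in sqlite3 module. Only the table names `events` and `event_consumers` come from the notes above; every column name, the helper functions, and the in-memory database are assumptions, and the real v6 schema may differ.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create a store with an append-only event log and per-consumer offsets."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS events (
            id      INTEGER PRIMARY KEY AUTOINCREMENT,
            type    TEXT NOT NULL,
            payload TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS event_consumers (
            consumer      TEXT PRIMARY KEY,
            last_event_id INTEGER NOT NULL DEFAULT 0
        );
    """)
    return conn


def publish(conn, event_type, payload):
    """Append one event and return its id."""
    cur = conn.execute(
        "INSERT INTO events (type, payload) VALUES (?, ?)", (event_type, payload)
    )
    conn.commit()
    return cur.lastrowid


def consume(conn, consumer, limit=100):
    """Return events this consumer has not seen yet and advance its offset."""
    row = conn.execute(
        "SELECT last_event_id FROM event_consumers WHERE consumer = ?", (consumer,)
    ).fetchone()
    offset = row[0] if row else 0
    rows = conn.execute(
        "SELECT id, type, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (offset, limit),
    ).fetchall()
    if rows:
        conn.execute(
            "INSERT OR REPLACE INTO event_consumers (consumer, last_event_id) "
            "VALUES (?, ?)",
            (consumer, rows[-1][0]),
        )
        conn.commit()
    return rows
```

Because each consumer stores its own offset, a dashboard and a learner could read the same log independently, and a restarted consumer resumes where it left off.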

17 Feb 11:59

github-actions

v2.4.0

5580dd5

Shipwright v2.4.0

Full Changelog: v2.3.1...v2.4.0

17 Feb 02:25

github-actions

v2.3.1

001a920

v2.3.1 — Autonomous Feedback Loops, Testing Foundation, Chaos Resilience

What's New

Testing Foundation (211 new tests)

Vitest unit tests — 113 tests across state store, API client, router, WebSocket, design tokens, and icons
Server API tests — 46 endpoint tests for error handling, edge cases, and lifecycle operations
Autonomous E2E — 20 tests for daemon coordination, strategic ingestion, retro-optimize, oversight gates
Budget & chaos — 16 tests for budget limits, missing/corrupted files, large files, concurrent writes
Memory & discovery — 16 tests for failure patterns, fix effectiveness, discovery TTL, cross-pipeline learning

Feedback Loops Wired (Tier 1)

Production → Issues — Monitor stage always collects deploy logs, not just on error threshold
Retro → Self-optimize — Retrospective metrics automatically feed into template weight adjustments
Oversight → Merge — Oversight gate + approval gate mandatory before merge stage

Coordination Gaps Closed (Tier 2)

Autonomous ↔ Daemon — Detects running daemon, delegates via ready-to-build label instead of duplicate pipelines
Strategic → Autonomous — Strategic agent findings ingested, deduplicated, and fed into autonomous creation loop
AI-driven triage — Intelligence engine classification with --ai flag, falls back to keyword-based