Merged
29 changes: 29 additions & 0 deletions .github/workflows/ci.yml
@@ -69,3 +69,32 @@ jobs:
      - run: pip install pip-audit
      - run: pip install -e .
      - run: pip-audit --ignore-vuln CVE-2026-3219

  api-surface:
    # REQ-140 guard: regenerates the public CLI surface and fails the build
    # if the live output drifts from the committed fixture. Catches accidental
    # command additions / removals in PRs without forcing every contributor
    # to remember to run `specsmith api-surface > tests/fixtures/api_surface.json`.
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: actions/setup-python@v6
        with:
          python-version: "3.12"
          cache: pip
      - run: python -m pip install --upgrade pip
      - run: pip install -e ".[dev]"
      - name: Regenerate api_surface.json
        env:
          SPECSMITH_NO_AUTO_UPDATE: "1"
          SPECSMITH_PYPI_CHECKED: "1"
          PYTHONIOENCODING: utf-8
        run: |
          python -m specsmith.cli api-surface > /tmp/api_surface.live.json
      - name: Diff against committed fixture
        run: |
          diff -u tests/fixtures/api_surface.json /tmp/api_surface.live.json || {
            echo "::error::api_surface.json is stale. Regenerate via:"
            echo "  python -m specsmith.cli api-surface > tests/fixtures/api_surface.json"
            exit 1
          }
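
Reviewer note: for contributors without a POSIX `diff` on hand, the same drift check can be sketched in Python. This is an illustration only and not part of the PR: the helper name `surface_drift` is hypothetical, and the sketch normalises both JSON payloads before comparing, whereas the CI job above diffs the raw files.

```python
import difflib
import json


def surface_drift(committed_text: str, live_text: str) -> list[str]:
    """Return unified-diff lines between two api-surface payloads (empty if identical)."""
    # Re-serialise both sides so key order and whitespace cannot cause false positives.
    # NOTE: normalising is an assumption of this sketch; the CI step compares raw files.
    committed = json.dumps(json.loads(committed_text), indent=2, sort_keys=True)
    live = json.dumps(json.loads(live_text), indent=2, sort_keys=True)
    return list(
        difflib.unified_diff(
            committed.splitlines(),
            live.splitlines(),
            fromfile="tests/fixtures/api_surface.json",
            tofile="live",
            lineterm="",
        )
    )
```

An empty return value corresponds to the passing case of the `diff -u` step; any returned lines correspond to the stale-fixture failure.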
2 changes: 0 additions & 2 deletions .gitignore
@@ -42,5 +42,3 @@ temp/
.env
.repo-index/

# Test-generated cloud spawn manifests
.specsmith/cloud/
3 changes: 3 additions & 0 deletions .specsmith/chat/chat-2026-04-30T23-28-32.jsonl
@@ -0,0 +1,3 @@
{"role":"user","text":"audit","ts":"2026-04-30T23:28:47.744Z"}
{"role":"error","text":"[Apr 30, 07:28:53 PM] Agent process ended (signal SIGTERM) — send a message to restart","ts":"2026-04-30T23:28:53.486Z"}
{"role":"error","text":"specsmith not responding (tried: \"C:\\Users\\trist\\.specsmith\\venv\\Scripts\\specsmith.exe\").\nChoose Restart Session to retry, Open Settings to reinstall, or Reload Window if the problem persists.","ts":"2026-04-30T23:29:13.477Z"}
6 changes: 6 additions & 0 deletions .specsmith/chat/chat-2026-05-02T18-48-31.jsonl
@@ -0,0 +1,6 @@
{"role":"error","text":"[May 2, 02:48:54 PM] Agent process ended (signal SIGTERM) — send a message to restart","ts":"2026-05-02T18:48:54.604Z"}
{"role":"error","text":"specsmith not responding (tried: \"C:\\Users\\trist\\.specsmith\\venv\\Scripts\\specsmith.exe\").\nChoose Restart Session to retry, Open Settings to reinstall, or Reload Window if the problem persists.","ts":"2026-05-02T18:49:14.608Z"}
{"role":"user","text":"audit","ts":"2026-05-02T18:49:44.470Z"}
{"role":"user","text":"audit","ts":"2026-05-02T18:49:49.650Z"}
{"role":"error","text":"specsmith not responding (tried: \"C:\\Users\\trist\\.specsmith\\venv\\Scripts\\specsmith.exe\").\nChoose Restart Session to retry, Open Settings to reinstall, or Reload Window if the problem persists.","ts":"2026-05-02T18:50:10.408Z"}
{"role":"error","text":"[May 2, 02:57:32 PM] Agent process ended (signal SIGTERM) — send a message to restart","ts":"2026-05-02T18:57:32.608Z"}
7 changes: 0 additions & 7 deletions .specsmith/requirements.json
@@ -874,13 +874,6 @@
"source": "src/specsmith/cli.py, src/specsmith/agent/memory.py",
"status": "defined"
},
{
"id": "REQ-126",
"title": "Cloud Agent Stub Endpoint",
"description": "`specsmith cloud spawn <utterance> --endpoint <url>` packages working-tree + scaffold.yml + LEDGER.md as a tarball, POSTs to `<url>/spawn` with the utterance, and tails the returned JSONL stream URL. The contract is documented in `docs/site/cloud-agents.md`. The endpoint reference implementation is out of scope for 1.0 (documented as deferred).",
"source": "src/specsmith/cli.py, docs/site/cloud-agents.md",
"status": "defined"
},
{
"id": "REQ-127",
"title": "Onboarding Path Must Be Verified",
8 changes: 0 additions & 8 deletions .specsmith/runs/WI-NEXUS-006/pr-body.md
@@ -56,11 +56,3 @@ that envelope.
- WI-NEXUS-010: end-to-end documentation pass for the broker → preflight →
gated execution flow.

---

🤖 Generated with [Warp](https://app.warp.dev) — agent conversation:
[link](https://app.warp.dev/conversation/6f8aa790-049b-4ddf-9c52-4840728faee5)

Plan artifact: [Warp Agent Implementation Plan](https://app.warp.dev/drive/notebook/rfCwIZUgJPCakjJ2S552DX)

Co-Authored-By: Oz <oz-agent@warp.dev>
8 changes: 0 additions & 8 deletions .specsmith/runs/WI-NEXUS-015/pr-body.md
@@ -49,11 +49,3 @@ follow-up work items, all governed by Specsmith and verified by pytest.
- The preflight ledger writer is best-effort — ledger errors never block
the CLI from emitting its JSON or returning its exit code.

---

🤖 Generated with [Warp](https://app.warp.dev) — agent conversation:
[link](https://app.warp.dev/conversation/6f8aa790-049b-4ddf-9c52-4840728faee5)

Plan artifact: [Warp Agent Implementation Plan](https://app.warp.dev/drive/notebook/rfCwIZUgJPCakjJ2S552DX)

Co-Authored-By: Oz <oz-agent@warp.dev>
8 changes: 0 additions & 8 deletions .specsmith/runs/WI-NEXUS-020/pr-body.md
@@ -48,11 +48,3 @@ existing AEE epistemic infrastructure. **Suite: 259 passing, 1 skipped
- All new ledger writes are wrapped in `try/except` so ledger errors never
block the CLI.

---

🤖 Generated with [Warp](https://app.warp.dev) — agent conversation:
[link](https://app.warp.dev/conversation/6f8aa790-049b-4ddf-9c52-4840728faee5)

Plan artifact: [Warp Agent Implementation Plan](https://app.warp.dev/drive/notebook/rfCwIZUgJPCakjJ2S552DX)

Co-Authored-By: Oz <oz-agent@warp.dev>
4 changes: 0 additions & 4 deletions .specsmith/runs/WI-NEXUS-023/pr-body.md
@@ -51,7 +51,3 @@ mypy src/specsmith/: Success: no issues found in 69 source files
gh dependabot/alerts: []
```

## Conversation + plan

- Conversation: https://app.warp.dev/conversation/6f8aa790-049b-4ddf-9c52-4840728faee5
- Plan: https://app.warp.dev/drive/notebook/rfCwIZUgJPCakjJ2S552DX
11 changes: 0 additions & 11 deletions .specsmith/testcases.json
@@ -1374,17 +1374,6 @@
"expected_behavior": {},
"confidence": 1.0
},
{
"id": "TEST-126",
"title": "Cloud Spawn Documents Endpoint Contract",
"description": "`docs/site/cloud-agents.md` exists and documents the POST contract (`/spawn`, request body, response body, JSONL stream URL). `specsmith cloud spawn --help` shows the `--endpoint` flag.",
"requirement_id": "REQ-126",
"type": "unit",
"verification_method": "pytest",
"input": {},
"expected_behavior": {},
"confidence": 1.0
},
{
"id": "TEST-127",
"title": "Onboarding Doctor Has Required Checks",
14 changes: 7 additions & 7 deletions CHANGELOG.md
@@ -6,28 +6,28 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]
### Removed
- **Cloud Runs feature retired.** `specsmith cloud spawn`, `specsmith cloud-serve`, `src/specsmith/cloud_serve.py`, `docs/site/cloud-agents.md`, the `.specsmith/cloud/` storage convention, and all related tests/fixtures have been removed. The deferred REQ-126/REQ-136 cloud-agent surface is no longer part of the 1.0 contract.
## [0.7.0] — 2026-04-30
### Added
- **`specsmith serve --auth-token` (REQ-137).** Optional bearer-token gate on every `/api/*` endpoint. `/api/health` stays open so liveness probes still work behind a load balancer that strips `Authorization`. New `make_server()` factory in `src/specsmith/serve.py` exposes a fully wired server for tests; `run_server()` adds the banner + `serve_forever` loop. `_Handler._authorize()` enforces `Authorization: Bearer <token>` on `do_GET`, `do_POST`, and `do_DELETE`.
- **`specsmith voice transcribe <wav>` (REQ-141).** New `src/specsmith/agent/voice.py` wraps the optional `whisper-cpp-python` extra. Three resolution modes: real (library + model file under `~/.specsmith/voice/` or `SPECSMITH_VOICE_MODEL`), stub (`SPECSMITH_VOICE_STUB=<text>` for tests/CI), or unavailable (raises `VoiceUnavailableError` with an actionable install hint). CLI exposes `voice transcribe --json` and `voice status`.
- **`specsmith cloud spawn <manifest> --endpoint --token --dry-run` (REQ-136).** Replaces the original REQ-126 stub. The new shape reads a YAML or JSON manifest, POSTs it to `<endpoint>/spawn`, and prints the response. `--token` adds bearer auth; `--dry-run` prints the would-be POST as JSON without leaving the host. Manifests must be mappings; lists / scalars exit 2 with a clear message.
- **`tests/test_warp_parity_followup.py`** — 20 new pytest cases covering: serve auth-gate (open `/api/health`, 401 on missing/wrong token, 200 on correct token), cloud spawn (dry-run JSON output, manifest type validation, 401 on missing token, persistence on success), voice (stub mode, missing-file error, unavailable-when-no-library + no-stub, status output), and the api-surface stability snapshot (matches fixture, required commands present, exit codes + event types frozen).
- **`tests/test_warp_parity_followup.py`** — covers serve auth-gate (open `/api/health`, 401 on missing/wrong token, 200 on correct token), voice (stub mode, missing-file error, unavailable-when-no-library + no-stub, status output), and the api-surface stability snapshot (matches fixture, required commands present, exit codes + event types frozen).
- **`docs/site/api-stability.md`** — documents the `api-surface` snapshot mechanism: payload shape, regeneration command, the required-command spot check, and what is *not* covered by the snapshot.
- **Specsmith Drive (REQ-133).** New `src/specsmith/drive.py` module exposes `push()`, `pull()`, `listing()`; mirrors project rules / workflows / notebooks under `~/.specsmith/drive/<project>/<kind>/`. Round-trip safe; default backend is filesystem-only so the user can `git push` themselves.
- **Per-block share / export (REQ-134).** New `src/specsmith/block_export.py` plus `specsmith chat-export-block --session-id <id> --block-id <id> [--format md|json|html]` slices a single block out of `.specsmith/sessions/<id>/events.jsonl` (fallback `turns.jsonl`) and emits a self-contained markdown / JSON / HTML snippet. Raises `FileNotFoundError` for missing sessions and `KeyError` for missing blocks; the CLI exits non-zero in either case.
- **AI-searchable history (REQ-135).** New `src/specsmith/history_search.py` adds a deterministic keyword `search()` over every `.specsmith/sessions/<id>/turns.jsonl` plus an optional `semantic=True` mode that uses `sentence-transformers` when available and silently falls back to keyword matching otherwise. New `[history-semantic]` extra in `pyproject.toml`.
- **Reference cloud-agent receiver (REQ-136).** New `src/specsmith/cloud_serve.py` ships a stdlib `HTTPServer` accepting `POST /spawn` (manifest JSON) and `GET /health`. Bearer-token auth + CIDR allowlist + a guardrail that refuses to bind non-loopback hosts without `--allow-cidr`. Persists each manifest under `~/.specsmith/cloud-runs/<run_id>/manifest.json`. Wired up as `specsmith cloud-serve --host --port --token --allow-cidr`.
- **`specsmith api-surface` (REQ-140).** Top-level command emits the frozen 1.0 public surface (`cli_commands`, `exit_codes`, `event_types`) as JSON; `--snapshot <path>` writes the same payload to disk for CI diffing.
- **`[voice]` optional extra (REQ-141).** Pyproject extra carrying `whisper-cpp-python` for the upcoming agent voice-input integration (not yet wired into the CLI).
- **`tests/test_warp_parity.py`** -- 20 new pytest cases covering the four new modules, the API-surface contract, and the CLI wiring (incl. localhost cloud-serve roundtrips, missing-token / wrong-token rejection, and the non-loopback guardrail).
- **`tests/test_warp_parity.py`** -- pytest cases covering the new drive / block-export / history-search modules, the API-surface contract, and the CLI wiring.
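
Reviewer note: the bearer-token gate described in the REQ-137 entry can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `src/specsmith/serve.py`: the function name `is_authorized` and its signature are hypothetical, and only the behaviour stated in the changelog is modelled (open `/api/health`, `Authorization: Bearer <token>` required on other `/api/` paths). Leaving non-API paths ungated is an assumption of the sketch.

```python
import hmac
from typing import Mapping, Optional


def is_authorized(path: str, headers: Mapping[str, str], token: Optional[str]) -> bool:
    """Decide whether a request may proceed (hypothetical helper, not the real handler)."""
    if token is None:
        return True  # no --auth-token configured: the gate is disabled
    if path == "/api/health" or not path.startswith("/api/"):
        return True  # liveness probe stays open; non-API paths are not gated in this sketch
    supplied = headers.get("Authorization", "")
    expected = f"Bearer {token}"
    # Constant-time comparison avoids leaking the token through timing differences.
    return hmac.compare_digest(supplied.encode(), expected.encode())
```

A caller would typically answer 401 whenever this returns `False`, which matches the missing-token and wrong-token cases listed for `tests/test_warp_parity_followup.py`.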

- **Real MCP JSON-RPC client (REQ-130).** `agent.mcp` now ships a full stdio client (`MCPSession`) that runs the official MCP handshake (`initialize` -> `notifications/initialized` -> `tools/list`) against any configured server, exposes each discovered tool as an `MCPTool` whose `invoke_with_safety()` runs every call through the supplied safety check. Protocol pinned at `2024-11-05`. The chat session header now reports tools-per-server counts.
- **`tests/fixtures/mcp_fake_server.py`** -- pure-Python stdio MCP server fixture for hermetic tests.
- **`tests/test_mcp_client.py`** -- 8 new pytest cases (handshake, protocol pin, idempotent close, text/error/unknown-tool, safety integration, crash recovery, loader silent-skip).
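
Reviewer note: the handshake order named in the REQ-130 entry (`initialize` -> `notifications/initialized` -> `tools/list`) can be illustrated by building the three JSON-RPC 2.0 client messages. This is a sketch under assumptions, not the `MCPSession` implementation: the helper names and the `clientInfo` values are placeholders, and the protocol version is the one pinned in the changelog.

```python
import json

PROTOCOL_VERSION = "2024-11-05"  # version pinned in the changelog entry


def handshake_messages(client_name: str = "example-client",
                       client_version: str = "0.0.0") -> list[dict]:
    """Build the three client messages of an MCP handshake, in send order."""
    return [
        {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "initialize",
            "params": {
                "protocolVersion": PROTOCOL_VERSION,
                "capabilities": {},
                "clientInfo": {"name": client_name, "version": client_version},
            },
        },
        # Notifications carry no "id", so the server does not reply to them.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
    ]


def to_stdio_frames(messages: list[dict]) -> str:
    """Serialise messages as newline-delimited JSON for a stdio transport."""
    return "".join(json.dumps(m, separators=(",", ":")) + "\n" for m in messages)
```

A real client would wait for the `initialize` response before sending the notification and the `tools/list` request; the sketch only shows the message shapes and their order.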

- **MCP server announcement in chat sessions (REQ-121).** When `.specsmith/mcp.yml` is present, `specsmith chat` now loads the configured servers via `agent.mcp.load_mcp_tools` and emits a `[mcp servers: <names>]` token at the top of the message block so consumers (and the user) see which external tool surfaces are in play. The Specsmith safety middleware still gates every call.
- **`specsmith notebook record --session-id <id>`** now reads `.specsmith/sessions/<id>/turns.jsonl` and embeds each turn as a `### <role>` section in the generated `docs/notebooks/<slug>.md`, alongside any `--work-item-id` artifacts. Both flags may be combined; either may be omitted (with a friendlier placeholder when neither is supplied). Closes the gap between TESTS.md TEST-123 and the existing implementation.
- **`tests/test_phase34_completion.py`** — 12 new pytest cases covering: MCP loader (config-missing, single entry, malformed entries dropped, unparseable yaml, MCPServerSpec round-trip), notebook record (session-turns capture, helpful placeholder), notebook replay (success + missing slug exit-code), `cloud spawn --dry-run` (manifest + tarball + `--help` documents `--endpoint`), and a stubbed `scripts/perf_smoke.py` smoke test that asserts the baseline.json schema without spawning real subprocesses.
- **`tests/test_phase34_completion.py`** — pytest cases covering: MCP loader (config-missing, single entry, malformed entries dropped, unparseable yaml, MCPServerSpec round-trip), notebook record (session-turns capture, helpful placeholder), notebook replay (success + missing slug exit-code), and a stubbed `scripts/perf_smoke.py` smoke test that asserts the baseline.json schema without spawning real subprocesses.
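
Reviewer note: the deterministic keyword mode described in the REQ-135 entry can be sketched as below. This is an illustration, not the code in `src/specsmith/history_search.py`: the function name `keyword_search` is hypothetical, and the sketch assumes each JSONL line is an object with a `text` field, as in the chat logs elsewhere in this diff. The optional semantic mode is not modelled here.

```python
import json


def keyword_search(jsonl_lines: list[str], query: str) -> list[dict]:
    """Return turns whose text contains every query term, best match first."""
    terms = [t for t in query.lower().split() if t]
    hits = []
    for index, line in enumerate(jsonl_lines):
        try:
            turn = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than failing the whole search
        if not isinstance(turn, dict):
            continue  # only JSON objects are treated as turns
        text = str(turn.get("text", "")).lower()
        if terms and all(term in text for term in terms):
            score = sum(text.count(term) for term in terms)
            hits.append((-score, index, turn))
    # Sorting on (-score, index) keeps the order deterministic for equal scores.
    return [turn for _, _, turn in sorted(hits, key=lambda h: h[:2])]
```

Ranking by term count with the line index as a tie-breaker is one simple way to keep results reproducible across runs, which is the property a keyword fallback needs.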

### Changed
- `specsmith chat` imports `load_mcp_tools` and emits the MCP-servers token after the rules-loaded notice.
@@ -468,7 +468,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **`specsmith init --guided`**: interactive architecture definition with REQ/TEST stub generation.
- **Auditor**: 6 health checks (files, REQ↔TEST, ledger, governance size, tool config, consistency). `--fix` auto-repairs missing files and CI configs.
- **Domain-specific templates**: patent claims/spec/figures, legal contracts/regulatory, business exec-summary/financials, research citations/methodology, API endpoints/auth.
- **7 agent integrations**: AGENTS.md, Warp/Oz, Claude Code, Cursor, Copilot, Gemini, Windsurf, Aider.
- **7 agent integrations**: AGENTS.md, Claude Code, Cursor, Copilot, Gemini, Windsurf, Aider.
- **3 VCS platforms**: GitHub (`gh`), GitLab (`glab`), Bitbucket (`bb`) with CI/CD, dependency management (Dependabot/Renovate per ecosystem), and status checks.
- **Config inheritance**: `extends` field in scaffold.yml for org-level defaults.
- **Type-specific .gitignore**: Rust, Go, Node, Kotlin, .NET, KiCad, FPGA, Zephyr, LaTeX, Terraform patterns.
@@ -498,7 +498,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **`specsmith diff`**: compare governance files against what spec templates would generate.
- **`audit --fix`**: auto-repair missing governance files and compress oversized ledgers.
- **Config inheritance**: `extends` field in scaffold.yml to inherit org-level defaults.
- **7 agent integration adapters**: Warp/Oz, Claude Code, Cursor, Copilot, Gemini, Windsurf, Aider.
- **6 agent integration adapters**: Claude Code, Cursor, Copilot, Gemini, Windsurf, Aider.
- **3 VCS platform integrations**: GitHub (`gh`), GitLab (`glab`), Bitbucket (`bb`) with CI/CD, dependency, and security config generation.
- **Domain-specific scaffold directories**: FPGA, Yocto, PCB, Embedded, Web, Rust, Go, C/C++, .NET, Mobile, DevOps, Data/ML, Microservices.
- **Branching strategy config**: gitflow, trunk-based, github-flow with tuning knobs.
6 changes: 3 additions & 3 deletions LEDGER.md
@@ -134,7 +134,7 @@ Extensive research and gap analysis session to bring specsmith architecture to f
- `docs/REQUIREMENTS.md` — 15 new requirement domains (OPS, CMD, MAS, ORC, FLG, LRN, EDD, MEM, HRK, SRV, RTR, LPR, MCP, SEC, IDE) with 60+ formal requirements
- `docs/ARCHITECTURE.md` — Added "Planned Architecture Evolution" section covering all new components, multi-agent patterns, eval design, and architecture invariants
- `AGENTS.md` — Added planned commands, planned file registry entries, updated tech stack
- Architecture plan document updated in Warp Oz with full gap analysis and 16-workstream roadmap
- Architecture plan document updated with full gap analysis and 16-workstream roadmap

### Open TODOs (Phase 1 — next immediate actions)

@@ -599,8 +599,8 @@ Phase 4: feature flags, instinct/learning, eval harness, agent memory, multi-age
- **Status**: complete
- **Chain hash**: `dd0115de0abeff8d...`

## 2026-04-28T09:05 — Nexus 1.0 roadmap groundwork landed (REQ-108..REQ-129): real verifier signal, JSONL chat block protocol (chat/notebook/cloud subcommands), persistent session memory, MCP loader, dynamic router, project-rules auto-injection, --predict-only and --comment flags, doctor --onboarding, perf smoke harness, e2e+unit tests, API-stability doc. Pre-1.0; no version bump.
- **Author**: oz
## 2026-04-28T09:05 — Nexus 1.0 roadmap groundwork landed (REQ-108..REQ-129): real verifier signal, JSONL chat block protocol (chat/notebook subcommands), persistent session memory, MCP loader, dynamic router, project-rules auto-injection, --predict-only and --comment flags, doctor --onboarding, perf smoke harness, e2e+unit tests, API-stability doc. Pre-1.0; no version bump.
- **Author**: specsmith-agent
- **Type**: feature
- **REQs affected**: REQ-108,REQ-109,REQ-110,REQ-111,REQ-112,REQ-113,REQ-114,REQ-115,REQ-116,REQ-117,REQ-118,REQ-119,REQ-120,REQ-121,REQ-122,REQ-123,REQ-124,REQ-125,REQ-126,REQ-127,REQ-128,REQ-129
- **Status**: complete
19 changes: 19 additions & 0 deletions README.md
@@ -16,6 +16,25 @@ specsmith treats belief systems like code: codable, testable, and deployable. It
epistemically-governed projects, stress-tests requirements as BeliefArtifacts, runs
cryptographically-sealed trace vaults, and orchestrates AI agents under formal AEE governance.

**0.10.0 — Multi-Agent + BYOE.** A `/plan` goes to the architect, `/fix`
goes to the coder, `/review` goes to a reviewer that runs on a different
model family. Each *profile* is a `(provider, model, endpoint?, fallback_chain)`
bundle stored in `~/.specsmith/agents.json`; an *activity routing table*
maps slash commands and AEE phases to profiles; **BYOE endpoints**
(`~/.specsmith/endpoints.json`) let you point a profile at any
OpenAI-v1-compatible backend you self-host (vLLM, llama.cpp `server`,
LM Studio, TGI, ...). Cross-family **diversity guard**, capability
filtering, transient-failure fallback chains, and TraceVault decision
seals on every `/agent` pin are wired in by default. See
[`docs/site/agents.md`](docs/site/agents.md) for the five-minute walkthrough.

```bash
specsmith agents preset apply default # frontier coder + cross-family reviewer
specsmith endpoints add --id home-vllm \
--base-url http://10.0.0.4:8000/v1 --auth bearer-keyring
specsmith run --agent opus-reviewer # one-shot per-session pin
```
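
Reviewer note: the profile, routing-table, and diversity-guard ideas in the paragraph above can be sketched in a few lines of Python. Everything here is illustrative: the `Profile` fields mirror the `(provider, model, endpoint?, fallback_chain)` bundle, but the class, the `PROFILES` and `ROUTES` tables, the profile ids, the model strings, and the rule that a model family equals its provider are assumptions of this sketch, not the schema of `~/.specsmith/agents.json`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    provider: str
    model: str
    endpoint: Optional[str] = None          # BYOE endpoint id, if any
    fallback_chain: tuple[str, ...] = ()    # profile ids to try on transient failure


# Hypothetical profiles; the names and models are placeholders.
PROFILES = {
    "architect": Profile("provider-a", "model-large"),
    "coder": Profile("provider-a", "model-fast", fallback_chain=("architect",)),
    "reviewer": Profile("provider-b", "model-review"),
}

# Activity routing table: slash command -> profile id.
ROUTES = {"/plan": "architect", "/fix": "coder", "/review": "reviewer"}


def resolve(command: str) -> Profile:
    """Look up the profile that handles a slash command."""
    return PROFILES[ROUTES[command]]


def violates_diversity(coder: Profile, reviewer: Profile) -> bool:
    """True when the reviewer shares a family with the coder (family == provider here)."""
    return coder.provider == reviewer.provider
```

With these placeholder tables, `/fix` and `/review` resolve to different providers, so the guard passes; pointing both at the same provider would trip it.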

It also co-installs the standalone `epistemic` Python library for direct use in any project:

```python
6 changes: 0 additions & 6 deletions REQUIREMENTS.md
@@ -849,12 +849,6 @@
- **Description:** `specsmith chat` accepts `--parent-session <id>`. When set, the spawned session's `task_complete` event also writes a `sub_session_complete` event into the parent's session log so the parent's plan-block can surface child outcomes.
- **Source:** src/specsmith/cli.py, src/specsmith/agent/memory.py
- **Status:** defined
## 126. Cloud Agent Stub Endpoint
- **ID:** REQ-126
- **Title:** Cloud Agent Stub Endpoint
- **Description:** `specsmith cloud spawn <utterance> --endpoint <url>` packages working-tree + scaffold.yml + LEDGER.md as a tarball, POSTs to `<url>/spawn` with the utterance, and tails the returned JSONL stream URL. The contract is documented in `docs/site/cloud-agents.md`. The endpoint reference implementation is out of scope for 1.0 (documented as deferred).
- **Source:** src/specsmith/cli.py, docs/site/cloud-agents.md
- **Status:** defined
## 127. Onboarding Path Must Be Verified
- **ID:** REQ-127
- **Title:** Onboarding Path Must Be Verified