Skip to content

refactor(tools): tighten platform tool schemas; storage-symmetric shape#127

Merged
mgoldsborough merged 2 commits intomainfrom
worktree-tool-schema-cleanup
Apr 29, 2026
Merged

refactor(tools): tighten platform tool schemas; storage-symmetric shape#127
mgoldsborough merged 2 commits intomainfrom
worktree-tool-schema-cleanup

Conversation

@mgoldsborough
Copy link
Copy Markdown
Contributor

Summary

The LLM-facing JSON Schema is a contract. A loose manifest: { type: "object" } invites the model to invent structure under-spec — saw this in production conv_30076c3681ad4c91 where Sonnet serialized the manifest as a JSON-string because the schema had no inner shape, then the validator rejected it, and the user saw "Couldn't create" with no signal what went wrong.

This PR applies four invariants across every platform tool that authors a persistent thing, plus a directory-scoped CLAUDE.md (symlinked as AGENTS.md) so future contributors author against the convention rather than retrofit.

Four invariants

  1. MCP-native — built via defineInProcessApp, surfaces through tools/list / tools/call byte-identical to a subprocess MCP server. Same wire format external clients (Claude Code, Cursor, Claude Desktop) consume.
  2. Strict input schemas — every object declares properties; every array declares items; identifiers use pattern; constrained strings use enum; bounded numbers use minimum/maximum. Locked by test/unit/tools/platform/schema-shape.test.ts.
  3. Storage-symmetric shape — create/update tools use { scope?, manifest, body } where manifest mirrors the on-disk metadata field-for-field and body is the content payload. No name at root, no flat-config-at-root.
  4. Minimum sufficient surface — operator/runtime-set fields (source, bundleName, allowedTools, requiresBundles, overrides, derivedFrom) live on the type and the on-disk file, not in the LLM-facing schema.

Per-tool changes

Tool Change
skills__create / update Full SkillManifest mirror; name moved into manifest; nested metadata for keywords/triggers/category/tags; dropped allowedTools, requiresBundles, loadingStrategy, appliesToTools, overrides, derivedFrom from LLM input
instructions__write_instructions Renamed textbody for cross-tool consistency
automations__create / update Restructured to {manifest, body}; promptbody; manifest holds schedule + run-time policy; dropped allowedTools, source, bundleName from input
files__create Restructured to {manifest, body}; base64_databody; mime_typemimeType

Deleted

  • src/skills/manifest-input.ts — loose-shape coercion accepting kebab/snake/camel variants. The schema is the contract; one casing wins; the validator enforces it.

Added

  • src/tools/platform/CLAUDE.md (with AGENTS.md symlink) — directive, ~200 lines, the four invariants and an "adding a new tool" checklist. Authoring rules for any future platform tool.
  • test/unit/tools/platform/schema-shape.test.ts — walks every source's tools/list, recursively asserts every object has properties and every array has items. Locks invariants (1) and (2) against future regression. New sources register in the test's SOURCES array; new tools on existing sources are auto-detected.

No backward compat

Clean break. Handlers cast directly to typed interface CreateInput/UpdateInput since the validator already enforced the schema upstream — no defensive re-validation, no kebab/snake/camel coercion, no flat-field plucking. The LLM rebuilds its tool understanding every conversation; there is no migration window to bridge.

Test plan

  • bun run verify green: 2129 unit, 213 web, 416 integration, 17 smoke
  • schema-shape.test.ts passes — locks (1) and (2) for skills, instructions, files, conversations
  • Existing skill / automation / file tests updated to new shape (no compat layer)
  • Manual: create a workspace skill via the chat agent — confirm the agent produces the new shape correctly with no validator rejection
  • Manual: create an automation via the chat agent — same
  • Manual: upload a file via files__create from the agent — same

Files

  • src/tools/platform/skills.ts — schema rewrite, handler simplification, types import
  • src/tools/platform/instructions.ts — text → body
  • src/tools/platform/files.ts — schema rewrite, handler refactor
  • src/bundles/automations/src/schemas.ts — full schema rewrite
  • src/bundles/automations/src/server.ts — handler rewrite, removed containsRecursiveTool (dead code; allowedTools no longer reachable from input)
  • src/skills/manifest-input.ts — DELETED
  • src/skills/index.ts — drop coercer export
  • src/tools/platform/CLAUDE.md — new convention doc
  • src/tools/platform/AGENTS.md — symlink to CLAUDE.md
  • test/unit/tools/platform/schema-shape.test.ts — new lint
  • test/**/*.test.ts — multiple test fixtures updated to new shape
  • web/src/pages/settings/components/WorkspaceInstructions.tsx — body field rename

The LLM-facing JSON Schema is a contract. A loose `manifest: { type:
"object" }` invites the model to invent structure under-spec — saw this
in production conv_30076c3681ad4c91 where Sonnet serialized the manifest
as a JSON-string because the schema had no inner shape.

Apply four invariants across every platform tool that authors a
persistent thing:

  1. MCP-native — built via `defineInProcessApp`, surfaces through
     `tools/list` and `tools/call` byte-identical to a subprocess MCP
     server. Same wire format external clients (Claude Code, Cursor)
     consume.
  2. Strict input schemas — every `object` declares `properties`; every
     `array` declares `items`; identifiers use `pattern`; constrained
     strings use `enum`; bounded numbers use `minimum`/`maximum`. Locked
     by test/unit/tools/platform/schema-shape.test.ts.
  3. Storage-symmetric shape — create/update tools use
     `{ scope?, manifest, body }` where `manifest` mirrors the on-disk
     metadata field-for-field and `body` is the content payload. No
     `name` at root.
  4. Minimum sufficient surface — operator/runtime-set fields
     (`source`, `bundleName`, `allowedTools`, `requiresBundles`,
     `overrides`, `derivedFrom`) live on the type, not in the LLM-
     facing schema.

Per-tool changes:

  - skills__create / update — full SkillManifest mirror; `name` moved
    into manifest; metadata sub-schema for keywords/triggers/category/
    tags. Dropped six advanced fields from the LLM input.
  - instructions__write_instructions — renamed `text` → `body` for
    cross-tool consistency.
  - automations__create / update — restructured to {manifest, body};
    `prompt` → `body`; manifest holds schedule + run-time policy. Dropped
    `allowedTools`, `source`, `bundleName` from input.
  - files__create — restructured to {manifest, body}; `base64_data` →
    `body`; `mime_type` → `mimeType` (camelCase consistency).

No backward compat — clean break. Handlers cast directly to typed
`interface CreateInput` since the validator already enforced the schema
upstream. Deleted `src/skills/manifest-input.ts` (loose-shape coercion
is no longer needed; the schema is the contract).

Adds `src/tools/platform/CLAUDE.md` (with `AGENTS.md` symlink) — directory-
scoped authoring rules for new platform tools. Lint test
`schema-shape.test.ts` walks every source's tools/list and asserts the
shape invariants automatically.

Verify: bun run verify green (2129 unit, 213 web, 416 integration, 17
smoke).
@mgoldsborough mgoldsborough force-pushed the worktree-tool-schema-cleanup branch from eac07f9 to 295013a Compare April 29, 2026 03:46
QA review of #127 found three silent-breakage regressions sharing one
root cause: the LLM-facing tool was treated as the only caller of the
automation domain. The CLI (`nb automation pause/resume`) and bundle
lifecycle were calling `automations__update` / `automations__create`
with the old flat shape; AJV with strict:false accepted the call but
the new handler reads `args.manifest`, never saw the fields, and
silently no-op'd. Bundle install path lost `source: "bundle"` and
`bundleName`, orphaning every bundle-contributed schedule on uninstall.

Architecture fix — extract a domain layer:

  src/bundles/automations/src/domain.ts (NEW)
    createAutomation(input, ctx)
    updateAutomation(name, patch, ctx)
    deleteAutomation(name, ctx)

  Accepts the full Automation shape including operator-only fields
  (`source`, `bundleName`, `allowedTools`, `ownerId`, `workspaceId`).

  src/bundles/automations/src/server.ts
    Tool handlers become thin schema-translators that delegate to the
    domain. LLM-facing path stamps `source: "agent"`, derives ownership
    from request context, and never reaches operator fields.

  src/runtime/runtime.ts
    `registerAutomationsContext(getter)` / `getAutomationsContext()` —
    the source factory registers a workspace-scoped context getter
    during construction; internal callers read it back to bypass the
    LLM-facing surface.

  src/cli/commands/automation.ts (issue 1)
    pause/resume call updateAutomation directly — no schema gymnastics.

  src/bundles/lifecycle.ts (issues 2+3)
    syncBundleAutomations / removeBundleAutomations call domain.
    `source: "bundle"` and `bundleName` round-trip cleanly. Uninstall
    finds and cleans up bundle-contributed schedules.

Plus follow-up issues from the same review:

  - test/unit/tools/platform/schema-shape.test.ts: register automations
    in SOURCES (issue 4)
  - executor.ts: recursion guard at the executor — the LLM can no longer
    set allowedTools, but operator file edits and bundle schedules still
    can; guard sees the merged Automation regardless of authoring path
    (issue 5)
  - validateAutomationFields(args: ValidatableAutomationFields) — typed
    signature replaces the synthetic-flat-record cast (issue 6)
  - skills__update + automations__update: separate
    *_UPDATE_MANIFEST_PROPERTIES omitting `name` so renames are rejected
    at schema validation rather than silently ignored (issue 7)
  - UPDATABLE_FIELDS now matches Automation type field order, comment
    explains why (issue 9)
  - CLAUDE.md § 1.4: "Internal callers use the domain API, not the tool
    handler" — codifies the convention so the next domain doesn't
    repeat this mistake

New test coverage:
  - test/unit/bundles/automations/domain.test.ts — pause/resume regression
  - executor.test.ts — recursion-guard rejects, permits non-recursive

Verify: bun run verify green (2151 unit, 213 web, 416 integration, 17
smoke).
@mgoldsborough
Copy link
Copy Markdown
Contributor Author

QA review adjudication — fixed in 263fb05

All 9 issues addressed. The three criticals shared one root cause: the LLM-facing tool was the only caller of the automation domain, so when we tightened its schema the CLI and lifecycle paths silently broke.

Architectural fix — domain API

Extracted src/bundles/automations/src/domain.ts exporting createAutomation / updateAutomation / deleteAutomation. Accepts the full Automation shape including operator fields (source, bundleName, allowedTools). The tool handlers become thin schema-translators that delegate to the domain and stamp source: "agent". Internal callers (CLI, lifecycle) bypass the LLM-facing surface and call the domain directly via a runtime-exposed getter.

CLAUDE.md § 1.4 codifies the rule: internal callers use the domain API, not the tool handler.

Per-issue resolution

# Status Fix
1 CLI pause/resume calls updateAutomation(name, { enabled }, ctx) directly. New domain.test.ts regression suite asserts enabled actually flips end-to-end.
2 lifecycle.ts::syncBundleAutomations calls createAutomation from domain with full shape.
3 Same call sets source: "bundle" + bundleName. removeBundleAutomations reads the store directly via ctx.definitions() and finds matching entries.
4 automations registered in schema-shape.test.ts SOURCES.
5 containsRecursiveTool restored, applied at the executor (sees the merged Automation regardless of authoring path). Three new executor tests cover reject-create / reject-update / permit-non-recursive.
6 validateAutomationFields(args: ValidatableAutomationFields) — typed signature, no synthetic flat-record cast.
7 SKILL_UPDATE_MANIFEST_PROPERTIES and AUTOMATION_UPDATE_MANIFEST_PROPERTIES derived via Omit<..., "name">. Schema rejects renames at validation.
8 Defer TODO comment unchanged — out of scope for this PR. Will spin a follow-up for the chat-multipart MIME allowlist.
9 UPDATABLE_FIELDS in domain.ts now matches Automation type declaration order; comment explains why.

Verify: 2151 unit (+9 new), 213 web, 416 integration, 17 smoke — all green.

@mgoldsborough mgoldsborough added the qa-reviewed QA review completed with no critical issues label Apr 29, 2026
@mgoldsborough mgoldsborough merged commit 7bbc041 into main Apr 29, 2026
4 checks passed
@mgoldsborough mgoldsborough deleted the worktree-tool-schema-cleanup branch April 29, 2026 07:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

qa-reviewed QA review completed with no critical issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant