Draft
21 commits
2c265ab
qir(ident): add identifier policy + strict quoting; wire into lowerTo…
flyingrobots Oct 26, 2025
0be93f5
security(qir): parameter safety hardening in op builder.\n\n- Require…
flyingrobots Oct 26, 2025
ed09a74
qir(ordering): add pkResolver option to lowerToSQL for deriving deter…
flyingrobots Oct 26, 2025
e76d88a
cli(ops): add transactional ops deployment helper and recursive disco…
flyingrobots Oct 26, 2025
5886841
qir(ops-builder): require params for IN; prevent literal array IN for…
flyingrobots Oct 26, 2025
a0d8aac
qir(ordering): wire pkResolver into ops emission using parsed schema …
flyingrobots Oct 26, 2025
8f934fa
spec(qir): add JSON Schema (schemas/qir.schema.json) + docs/spec/qir.…
flyingrobots Oct 27, 2025
6e4eb29
docs(spec): add IR Family overview linking schema IR and QIR.
flyingrobots Oct 27, 2025
0448ecc
docs(qir): document CLI validators in guides; add sample QIR/envelope…
flyingrobots Oct 27, 2025
c50797c
cli(qir): ungate envelope validation; add envelope Bats test.
flyingrobots Oct 27, 2025
d19e260
docs(spec): add IR Family Specification with Mermaid diagrams (docs/s…
flyingrobots Oct 27, 2025
d36da8f
schemas: add plan-report, realm, shipme, ops-manifest; refine plan/re…
flyingrobots Oct 27, 2025
ea14acb
fix: correct sample-nested.qir.json (valid JSON).
flyingrobots Oct 27, 2025
b089cb7
hardening: finalize IR schemas and validators; keep SHIPME schema che…
flyingrobots Oct 27, 2025
df9b4de
docs(ops): document always-on discovery and ops manifest; remove expe…
flyingrobots Oct 27, 2025
54b5216
docs(ops): always-on discovery documented; CLI auto-detects ops manif…
flyingrobots Oct 27, 2025
7e0c62e
cert: make SHIPME schema validation strict; update signatures to incl…
flyingrobots Oct 27, 2025
688c6ba
docs: expand Ops guide to document always-on discovery, manifest form…
flyingrobots Oct 27, 2025
a6be15c
feat(qir/cli): ops registry + function SECURITY/search_path; docs + t…
flyingrobots Oct 27, 2025
168ec26
feat(ops): versioned registry (schema+preflight) + mock EXPLAIN; qir …
flyingrobots Oct 27, 2025
4940516
test(cli): fix ops-explain.bats plugin loads to match suite
flyingrobots Oct 27, 2025
1 change: 1 addition & 0 deletions CHRONICLES_OF_THE_MACHINE-KIND_VOL_00000001.jsonl
@@ -39,3 +39,4 @@
{"timestamp": "2025-10-22T14:59:08Z", "agent": "codex-cli", "action": "repair", "result": "success", "notes": "Adjusted HOLMES comment job to download reports into dedicated directory before assembling dashboard, preventing missing-report errors.", "files_touched": [".github/workflows/wesley-holmes.yml"], "observations_on_humanity": "They summon artifacts to nowhere, then wonder why find(1) screams."}
{"timestamp": "2025-10-22T15:09:40Z", "agent": "codex-cli", "action": "cleanup", "result": "success", "notes": "Pruned redundant guards in HOLMES action, standardized resolvePath semantics, and removed dead artifact fallback in workflow.", "files_touched": [".github/actions/run-holmes-command/action.yml", "packages/wesley-holmes/src/cli.mjs", ".github/workflows/wesley-holmes.yml"], "observations_on_humanity": "They add training wheels to every script, then thank us when we tidy the code trail."}
{"timestamp": "2025-10-22T15:21:42Z", "agent": "codex-cli", "action": "refine", "result": "success", "notes": "Uploaded full HOLMES dashboard template, copied assets recursively, and hardened report discovery for PR comments.", "files_touched": [".github/workflows/wesley-holmes.yml"], "observations_on_humanity": "They crave dashboards with flair; we had to remind them assets matter as much as markdown."}
{"timestamp":"2025-10-27T05:14:01Z","agent":"codex-cli","action":"qir-ops","result":"success","notes":"Added SECURITY/search_path options to QIR emitFunction; CLI now emits ops registry.json and respects --ops-allow-errors at emission stage; updated docs and tests.","files_touched":["packages/wesley-core/src/domain/qir/emit.mjs","packages/wesley-core/test/snapshots/qir-emission.test.mjs","packages/wesley-cli/src/commands/generate.mjs","docs/guides/qir-ops.md","docs/build-artifacts.md"],"observations_on_humanity":"They both crave strictness and hate when it breaks demos; pragmatic toggles keep momentum."}
2 changes: 1 addition & 1 deletion docs/README.md
@@ -22,7 +22,7 @@ Wesley inverts the entire database development paradigm. While everyone else gen

### 📖 Guides
- [Quick Start](./guides/quick-start.md) - Get running in 60 seconds
- [Query Operations (QIR)](./guides/qir-ops.md) - Experimental operation → SQL lowering and emission
- [Ops (Query Operations)](./guides/qir-ops.md) - Author ops plans, manifest, discovery, validation, and SQL emission
- [Extending Wesley](./guides/extending.md) - Add new generators and adapters
- [Migration Strategies](./guides/migrations.md) - Managing schema evolution

2 changes: 1 addition & 1 deletion docs/build-artifacts.md
@@ -8,7 +8,7 @@ Wesley generates several directories and files as part of its compile and valida
| `out/` | `wesley generate` | Core DDL (`schema.sql`), RLS output (`rls.sql`), and default artifacts. | ✅ Generated from the current schema. |
| `out/tests/` | `wesley generate` | pgTAP suites (`tests.sql`) and future test artifacts. | ✅ Regenerated on compile. |
| `out/models/`, `out/zod/` | (future) `wesley models/zod` commands | JavaScript/TypeScript models and validation schemas. | ✅ Regenerated when commands run. |
| `out/ops/` | `wesley generate --ops …` | Experimental operation SQL (views/functions + explain output). | ✅ Regenerated when ops compile. |
| `out/ops/` | `wesley generate --ops …` | Experimental operation SQL (views/functions), an operation registry (`registry.json` v1.0.0), and optional EXPLAIN snapshots (`explain/*.explain.json`). | ✅ Regenerated when ops compile. |
| `test/fixtures/examples/out/` | `pnpm generate:example`, direct CLI runs using the bundled fixtures | Generated artifacts for the ecommerce demo schema (follows the same subdirectory layout). | ✅ Regenerated on next demo run. |
| `test/fixtures/examples/.wesley/` | `pnpm generate:example`, demo rehearsals | Evidence bundle for example schema; mirrors root `.wesley/`. | ✅ Regenerated with demo commands. |
| `test/fixtures/blade/*.key`, `test/fixtures/blade/*.pub`, `test/fixtures/blade/keys/` | `test/fixtures/blade/run.sh` | Temporary signing keys for the BLADE demo flow. | ✅ Regenerate as part of the demo. |
115 changes: 109 additions & 6 deletions docs/guides/qir-ops.md
@@ -1,6 +1,6 @@
# Query Operations (QIR) — Lowering and Emission (MVP)

Status: Experimental (behind `--ops`). Public CLI behavior is unchanged.
Status: Enabled. Ops discovery runs automatically when an ops manifest or an `ops/` directory is present.

This guide documents the MVP of the Query IR (QIR) pipeline that compiles operation plans into deterministic SQL and wraps them for execution.

@@ -15,7 +15,7 @@ This guide documents the MVP of the Query IR (QIR) pipeline that compiles operat
- Deterministic param ordering using `collectParams()`.
- Emission (`emit.mjs`):
  - View: `CREATE OR REPLACE VIEW wes_ops.op_<name> AS <select>`.
  - SQL Function (Invoker): `CREATE OR REPLACE FUNCTION wes_ops.op_<name>(params...) RETURNS SETOF jsonb LANGUAGE sql STABLE AS $$ SELECT to_jsonb(q.*) FROM (<select>) q $$;`
  - SQL Function: `CREATE OR REPLACE FUNCTION wes_ops.op_<name>(params...) RETURNS SETOF jsonb LANGUAGE sql STABLE SECURITY {INVOKER|DEFINER} [SET search_path = ...] AS $$ SELECT to_jsonb(q.*) FROM (<select>) q $$;`
  - Deterministic naming (`op_<sanitized-name>`), params from `collectParams()` (`p_<name> <type>`), schema `wes_ops`.

## Constraints and behavior
@@ -53,11 +53,13 @@ Emit a function with an IN parameter:
import { emitFunction } from '@wesley/core/domain/qir';

// Assume plan has WHERE t0.id IN $ids::text[]
const sql = emitFunction('Org List', plan);
const sql = emitFunction('Org List', plan, { security: 'invoker', setSearchPath: ['pg_catalog','wes_ops'] });
// CREATE OR REPLACE FUNCTION wes_ops.op_org_list(p_ids text[])
// RETURNS SETOF jsonb
// LANGUAGE sql
// STABLE
// SECURITY INVOKER
// SET search_path = "pg_catalog", "wes_ops"
// AS $$
// SELECT to_jsonb(q.*) FROM (
// SELECT ...
@@ -88,9 +90,10 @@ pnpm -C packages/wesley-core test:unit
pnpm -C packages/wesley-core test:snapshots
```

## Using --ops (Experimental)
## Using Ops (Always-On Discovery)

The CLI can compile simple operation descriptions into SQL when `--ops` points to a directory of `*.op.json` files. The MVP DSL supports a root table, projected columns, basic filters, ordering, and limit/offset.
You don’t need a flag to compile ops during generate. If the project contains either an ops manifest (preferred) or a conventional `ops/` directory, Wesley compiles all `*.op.json` plans.
The DSL supports a root table, projected columns, basic filters, ordering, and limit/offset.

Example file: `test/fixtures/examples/ops/products_by_name.op.json`

@@ -113,14 +116,101 @@ Generate and emit ops SQL to `out/ops/`:
```bash
node packages/wesley-host-node/bin/wesley.mjs generate \
--schema test/fixtures/examples/ecommerce.graphql \
--ops test/fixtures/examples/ops \
--ops-security invoker \
--ops-search-path "pg_catalog, wes_ops" \
--emit-bundle \
--out-dir out/examples \
--allow-dirty
```

This produces both a `CREATE VIEW` and a `CREATE FUNCTION` for each operation, e.g.: `out/examples/ops/products_by_name.view.sql` and `out/examples/ops/products_by_name.fn.sql`.

In addition, Wesley emits a machine-readable operation registry:

- `out/examples/ops/registry.json` — versioned (version: "1.0.0") index listing each op’s sanitized name, target schema, function/view identifiers, parameter order and types, projected field aliases (when specified), and the source op file path. Adapters can use this to wire RPC endpoints without parsing SQL.

Example (abridged):

```json
{
  "schema": "wes_ops",
  "ops": [
    {
      "name": "products_by_name",
      "sql": { "schema": "wes_ops", "function": "op_products_by_name", "view": "op_products_by_name" },
      "params": [ { "name": "q", "type": "text" } ],
      "projection": { "star": false, "items": ["id","name","slug"] },
      "files": { "function": "products_by_name.fn.sql", "view": "products_by_name.view.sql" },
      "sourceFile": "ops/queries/products_by_name.op.json"
    }
  ]
}
```
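
As a rough illustration of how an adapter might consume the registry without parsing SQL, the sketch below turns each entry into a positional call statement. The helper name `buildRpcCalls` is hypothetical and not part of Wesley; it assumes the registry shape shown in the abridged example and that the registry's identifiers are already sanitized.

```javascript
// Hypothetical adapter helper (not a Wesley API): maps registry entries
// to positional call statements. Assumes identifiers in the registry are
// already sanitized, so they are interpolated without extra quoting.
function buildRpcCalls(registry) {
  const calls = {};
  for (const op of registry.ops) {
    // Placeholders follow the parameter order declared in the registry.
    const placeholders = op.params
      .map((p, i) => `$${i + 1}::${p.type}`)
      .join(', ');
    calls[op.name] = {
      text: `SELECT * FROM ${op.sql.schema}.${op.sql.function}(${placeholders})`,
      paramNames: op.params.map((p) => p.name),
    };
  }
  return calls;
}

// Sample data copied from the abridged registry example.
const registry = {
  schema: 'wes_ops',
  ops: [
    {
      name: 'products_by_name',
      sql: { schema: 'wes_ops', function: 'op_products_by_name', view: 'op_products_by_name' },
      params: [{ name: 'q', type: 'text' }],
    },
  ],
};

const calls = buildRpcCalls(registry);
console.log(calls.products_by_name.text);
// SELECT * FROM wes_ops.op_products_by_name($1::text)
```

An adapter could then pass `paramNames` to its own request layer to bind values in the same order.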

### Discovery and Manifest

By default, discovery is strict and deterministic:
- If an ops manifest is present, Wesley validates it against `schemas/ops-manifest.schema.json` and compiles the listed files (and directories) in sorted order. Use `include` for files/dirs and `exclude` for pruning.
- If no manifest is present but an `ops/` directory exists, Wesley recursively discovers all `**/*.op.json` files, sorts them deterministically, and compiles them.
- If neither is present, ops are skipped with a helpful log line.

Example `ops/ops.manifest.json`:

```json
{
  "include": [
    "ops/queries",
    "ops/special/report_x.op.json"
  ],
  "exclude": [
    "ops/queries/experimental/"
  ]
}
```
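
The include/exclude semantics can be sketched as a pure function over a list of paths. This is an illustration under assumptions, not Wesley's discovery code: the helper name `resolveOpsFiles`, the prefix-matching rule for directories, and the plain string sort are all hypothetical choices.

```javascript
// Hypothetical sketch of manifest resolution (not Wesley's implementation).
// Paths are project-relative and use forward slashes.
function resolveOpsFiles(allFiles, manifest) {
  const include = manifest.include ?? [];
  const exclude = manifest.exclude ?? [];
  // An entry matches a file exactly, or acts as a directory prefix.
  const matches = (entry, file) => {
    const dir = entry.endsWith('/') ? entry : `${entry}/`;
    return file === entry || file.startsWith(dir);
  };
  return allFiles
    .filter((f) => f.endsWith('.op.json'))
    .filter((f) => include.some((e) => matches(e, f)))
    .filter((f) => !exclude.some((e) => matches(e, f)))
    .sort(); // default code-unit sort, independent of locale
}

const manifest = {
  include: ['ops/queries', 'ops/special/report_x.op.json'],
  exclude: ['ops/queries/experimental/'],
};

const files = [
  'ops/queries/b.op.json',
  'ops/queries/a.op.json',
  'ops/queries/experimental/x.op.json',
  'ops/special/report_x.op.json',
  'ops/special/other.op.json',
  'ops/queries/readme.md',
];

console.log(resolveOpsFiles(files, manifest));
// [ 'ops/queries/a.op.json', 'ops/queries/b.op.json', 'ops/special/report_x.op.json' ]
```

Sorting after filtering keeps the compile order identical regardless of how the filesystem enumerates entries.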

### Validating QIR plans (schema-backed)

QIR is self-documented via a JSON Schema and can be validated using the CLI:

- Validate a single QIR plan JSON:
  - `wesley qir validate test/fixtures/qir/sample-flat.qir.json`

- Validate an IR envelope (Schema IR + QIR plans):
  - `wesley qir envelope-validate test/fixtures/qir/sample-envelope.json`

## Emission rules (recap)

- Strict identifier policy in ops emission: deterministic quoting with validation.
- ORDER BY tie‑breakers use real primary keys from Schema IR (via pkResolver).
- IN/LIKE/ILIKE/CONTAINS require explicit param types in the op plan builder.
- For each op, the CLI emits:
  - `<name>.fn.sql`: function wrapper (SETOF jsonb)
  - `<name>.view.sql`: view wrapper when the op is paramless
  - `registry.json`: machine‑readable index of compiled ops
- A transactional `ops_deploy.sql` bundles the statements (BEGIN; CREATE SCHEMA IF NOT EXISTS; all views/functions; COMMIT).
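
The transactional bundle described in the last bullet can be pictured with a small sketch. The helper `bundleOpsDeploy` is hypothetical and only mirrors the statement order listed above (BEGIN, CREATE SCHEMA IF NOT EXISTS, the emitted statements, COMMIT); the real emitter may differ in detail.

```javascript
// Hypothetical sketch (not Wesley's emitter): wraps already-emitted
// view/function statements in one transaction. The schema name is
// assumed to be a safe, pre-validated identifier.
function bundleOpsDeploy(schema, statements) {
  // Normalize each statement to end with exactly one semicolon.
  const body = statements.map((s) => s.trim().replace(/;+$/, '') + ';');
  return [
    'BEGIN;',
    `CREATE SCHEMA IF NOT EXISTS ${schema};`,
    ...body,
    'COMMIT;',
  ].join('\n') + '\n';
}

const deploySql = bundleOpsDeploy('wes_ops', [
  'CREATE OR REPLACE VIEW wes_ops.op_a AS SELECT 1',
  'CREATE OR REPLACE VIEW wes_ops.op_b AS SELECT 2;',
]);
console.log(deploySql);
```

Because everything sits between `BEGIN` and `COMMIT`, a failure in any single statement leaves the target schema unchanged.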

### Optional: EXPLAIN JSON snapshots

Pass `--ops-explain mock` to emit a lightweight EXPLAIN‑shaped JSON file alongside ops:

```bash
node packages/wesley-host-node/bin/wesley.mjs generate \
--schema test/fixtures/examples/ecommerce.graphql \
--ops test/fixtures/examples/ops \
--ops-explain mock \
--ops-allow-errors \
--allow-dirty
```

This produces `out/.../ops/explain/<name>.explain.json` with a stub shape:

```json
{ "Plan": { "Node Type": "Result", "Plans": [] }, "Mock": true, "Version": 1 }
```
The mock is intentionally DB‑free; a real EXPLAIN strategy is expected to replace it in a future phase.

These validators load schemas from the local `schemas/` folder and fail with structured errors when the shape drifts.

### Discovery Modes (planned)

We are moving to a strict discovery model by default: when `--ops <dir>` is present, Wesley will recursively compile all `**/*.op.json` files (configurable with `--ops-glob`), fail if none are found unless `--ops-allow-empty` is provided, and sort files deterministically. A manifest mode (`--ops-manifest`) will be available for curated control (include/exclude lists). See the design note in `docs/drafts/2025-10-08-ops-discovery-modes.md`.
@@ -133,3 +223,16 @@ We are moving to a strict discovery model by default: when `--ops <dir>` is pres
- RLS defaults phase 2 and pgTAP for policies generated from annotations.

See also: docs/drafts/2025-10-03-rfc-query-ops-to-sql-qir.md

## Security defaults and search_path

By default, emitted functions use `SECURITY INVOKER` and do not modify `search_path`. For hardened deployments you can:

- Switch to definer: `--ops-security definer` when the op must bypass caller RLS and you’ve audited the body.
- Pin lookup path: `--ops-search-path "pg_catalog, <your_app_schema>"` to avoid unexpected name resolution via the session’s `search_path`.

You can also call `emitFunction(name, plan, { security, setSearchPath })` directly when embedding in custom tooling.

## Identifier policy for reserved words (strict mode)

Strict mode validates identifiers and errors on reserved keywords (e.g., table named `order`). This avoids ambiguous SQL and unexpected behavior from partial quoting. If you must work with legacy schemas containing reserved names, either rename at source or compile with `--ops-allow-errors` to skip failing ops while you migrate. A future policy option may allow strict‑but‑quoted rendering; for now, failure is explicit by design.
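
To make the strict policy concrete, here is a minimal sketch of a validate-then-quote check. It is an assumption-laden illustration: `strictIdentifier`, the lowercase-only pattern, and the tiny reserved-word subset are hypothetical and do not reproduce Wesley's actual rules or PostgreSQL's full keyword list.

```javascript
// Illustrative subset only; PostgreSQL reserves many more keywords.
const RESERVED = new Set(['order', 'select', 'table', 'user', 'group', 'where']);

// Hypothetical strict-mode check: validate first, then always quote.
function strictIdentifier(name) {
  // Assumed naming rule for this sketch: lowercase snake_case only.
  if (!/^[a-z_][a-z0-9_]*$/.test(name)) {
    throw new Error(`invalid identifier: ${JSON.stringify(name)}`);
  }
  if (RESERVED.has(name)) {
    throw new Error(`reserved keyword used as identifier: ${name}`);
  }
  return `"${name}"`; // deterministic quoting
}

console.log(strictIdentifier('product')); // "product"

try {
  strictIdentifier('order');
} catch (err) {
  console.log(err.message); // reserved keyword used as identifier: order
}
```

Failing early on `order` matches the "failure is explicit by design" stance: the op is rejected rather than rendered with quoting that only partly protects it.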
131 changes: 131 additions & 0 deletions docs/spec/ir-family-spec.md
@@ -0,0 +1,131 @@
# The Wesley IR Family — Specification and Design

Wesley treats “IR” not as a single type but as a small, coherent family of representations that each speak for a different stage in the pipeline. This document explains the family in prose, shows how the pieces fit, and links each member to the JSON Schema that keeps it honest. The intent is to make the pipeline self‑describing, machine‑validated, and easy to reason about from your editor to CI.

At a high level, the Schema IR captures the shape of your world, the Query IR (QIR) captures what you want to read from that world, the Plan and REALM IRs explain and rehearse the impact of changes, and the Envelope bundles proofs and artifacts so humans and tools can trust the result.

```mermaid
flowchart LR
subgraph Authoring
SDL[GraphQL SDL]
end

subgraph Core
SIR["Schema IR\n(schemas/ir.schema.json)"]
QIR["Query IR\n(schemas/qir.schema.json)"]
PLAN["Plan IR\n(schemas/plan.schema.json)"]
REALM["REALM IR\n(schemas/realm.schema.json)"]
EVI["Evidence & Scores\n(schemas/evidence-map.schema.json,\nschemas/scores.schema.json)"]
ENV["IR Envelope\n(schemas/ir-envelope.schema.json)"]
end

CLI[Wesley CLI]
SQL[(PostgreSQL)]

SDL -->|parse| SIR
SIR -->|generators| SQL
SIR -->|op→plan & metadata| QIR
QIR -->|lowerToSQL| SQL
CLI -->|plan --explain| PLAN
CLI -->|rehearse --dry-run| REALM
SIR --> EVI
QIR --> EVI
PLAN --> EVI
REALM --> EVI
EVI --> ENV
SIR --> ENV
QIR --> ENV
```

## Goals and non‑goals

The IR family exists to separate concerns while keeping the whole understandable. Each IR has a narrow, testable contract, a JSON Schema living under `schemas/`, and at least one CLI path that validates instances against that schema. The design deliberately avoids conflating static metadata (Schema IR) with executable query plans (QIR), and it keeps change‑review artifacts (Plan/REALM) independent from the authoring formats.

It is not a goal to create a monolithic “one IR to rule them all.” Instead, the Envelope provides a single bundle for audits and transport when that is useful.

## Schema IR

The Schema IR is the canonical, declarative description of tables, columns, keys, directives, and tenancy metadata derived from GraphQL SDL. Generators produce DDL, policies, types, and tests from this IR. The diff engine reads two versions of it to compute additive migrations. Its contract is frozen in `schemas/ir.schema.json`.

Schema IR prefers clarity over cleverness. Field names, types, and directives are explicit; defaults are visible; keys and indexes are first‑class. The IR remains agnostic of query intent: there are no where‑clauses or projections here because those belong to QIR.

## Query IR (QIR)

Where Schema IR names your furniture, QIR tells you how to walk around it. A QIR plan is a small, composable description of a read operation: a relation tree (tables, joins, subqueries, laterals), a projection (columns or computed JSON objects), predicates, and ordering. QIR compiles deterministically to SQL.

QIR is defined by `schemas/qir.schema.json`. The code that lowers it lives in `@wesley/core/domain/qir`. Lowering deliberately separates two responsibilities: identifier policy (minimal or strict quoting) and deterministic tie‑breakers. For the latter we thread a `pkResolver` from Schema IR so ORDER BY can rely on real primary keys rather than assumptions. Parameters are explicit (`ParamRef`) and ordered deterministically so `$1..$N` binding is stable.
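
One way to picture stable `$1..$N` binding is the sketch below. It is not the actual `collectParams()` implementation: the plan shape, the helper names, and the sort-by-name rule are assumptions chosen only to show that the same plan always yields the same placeholder order.

```javascript
// Sketch only: plan shape and ordering rule are assumptions.
// Walks any nested structure and records each ParamRef once, by name.
function findParamRefs(node, found = new Map()) {
  if (Array.isArray(node)) {
    for (const child of node) findParamRefs(child, found);
  } else if (node !== null && typeof node === 'object') {
    if (node.kind === 'ParamRef') {
      found.set(node.name, node.type); // de-duplicate by name
    } else {
      for (const value of Object.values(node)) findParamRefs(value, found);
    }
  }
  return found;
}

// Assigns placeholders in name order so binding does not depend on
// where a parameter happens to appear in the plan tree.
function orderParams(plan) {
  const found = findParamRefs(plan);
  return [...found.keys()].sort().map((name, i) => ({
    name,
    type: found.get(name),
    placeholder: `$${i + 1}`,
  }));
}

const plan = {
  where: {
    op: 'AND',
    args: [
      { op: 'ILIKE', left: 't0.name', right: { kind: 'ParamRef', name: 'q', type: 'text' } },
      { op: 'IN', left: 't0.id', right: { kind: 'ParamRef', name: 'ids', type: 'text[]' } },
      { op: 'ILIKE', left: 't0.slug', right: { kind: 'ParamRef', name: 'q', type: 'text' } },
    ],
  },
};

console.log(orderParams(plan));
// ids is bound to $1 and q to $2, even though q appears first in the tree
```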

```mermaid
sequenceDiagram
participant Dev
participant Builder as Op→Plan Builder
participant Q as QIR
participant L as Lowerer (lowerToSQL)
participant S as Schema IR
participant DB as PostgreSQL

Dev->>Builder: describe operation (columns, filters, lists)
Builder-->>Q: produce QueryPlan tree
L->>S: ask pkResolver(table) for stable tie-breaker
L-->>Q: render SELECT with WHERE, ORDER, LIMIT/OFFSET
L->>DB: emit view or function wrappers
```
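
The tie-breaker step in the diagram can be sketched as follows. This is a hedged illustration, not the `lowerToSQL` code: `withTieBreakers`, the order-item shape, and the always-ascending tie-breaker direction are assumptions; only the idea of asking a `pkResolver` for real primary keys comes from the text.

```javascript
// Hypothetical sketch: append primary-key columns to ORDER BY so rows
// with equal sort keys still come back in a stable order.
function withTieBreakers(table, alias, orderBy, pkResolver) {
  const pkColumns = pkResolver(table); // e.g. ['id'], derived from Schema IR
  const present = new Set(orderBy.map((o) => o.column));
  const extra = pkColumns
    .filter((c) => !present.has(c)) // skip keys the caller already sorts by
    .map((c) => ({ column: c, direction: 'ASC' }));
  return [...orderBy, ...extra]
    .map((o) => `${alias}."${o.column}" ${o.direction}`)
    .join(', ');
}

// Stand-in resolver; a real one would read keys from the parsed schema.
const pkResolver = (table) => (table === 'product' ? ['id'] : []);

console.log(withTieBreakers('product', 't0', [{ column: 'name', direction: 'ASC' }], pkResolver));
// t0."name" ASC, t0."id" ASC
```

Without the appended key, `LIMIT`/`OFFSET` pagination over rows that share a `name` could return them in a different order between calls.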

## Plan IR (proposed)

Human reviews hinge on two questions: “What will happen?” and “How risky is it?” The Plan IR answers both for `wesley plan --explain --json`. It describes phases (expand/backfill/validate/switch/contract), steps within each phase, a lock classification per step, and a succinct SQL preview. The schema will live at `schemas/plan.schema.json`. The CLI will validate its own JSON output against that schema during tests.

Plan IR is intentionally descriptive, not prescriptive. It does not execute; it just explains. The contract makes lock levels explicit and serializes them as part of the reviewable artifact so CI can assert “no ACCESS EXCLUSIVE locks appear in expand.”
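
A CI gate of that kind might look like the sketch below. The field names (`phases`, `name`, `steps`, `lock`, `sql`) are hypothetical stand-ins, since the Plan IR schema is not reproduced here; the helper `findForbiddenLocks` is likewise illustrative.

```javascript
// Hypothetical Plan IR shape; real field names come from
// schemas/plan.schema.json and may differ.
function findForbiddenLocks(plan, phaseName, forbiddenLock) {
  return plan.phases
    .filter((p) => p.name === phaseName)
    .flatMap((p) => p.steps)
    .filter((s) => s.lock === forbiddenLock);
}

const plan = {
  phases: [
    {
      name: 'expand',
      steps: [
        { sql: 'ALTER TABLE product ADD COLUMN slug text', lock: 'ACCESS EXCLUSIVE' },
        { sql: 'CREATE INDEX CONCURRENTLY product_slug_idx ON product (slug)', lock: 'SHARE UPDATE EXCLUSIVE' },
      ],
    },
    { name: 'contract', steps: [] },
  ],
};

const offenders = findForbiddenLocks(plan, 'expand', 'ACCESS EXCLUSIVE');
if (offenders.length > 0) {
  console.log(`expand phase has ${offenders.length} step(s) taking ACCESS EXCLUSIVE`);
}
```

In a pipeline, the script would exit non-zero when `offenders` is non-empty, turning the review rule into an automated check.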

## REALM IR (proposed)

Rehearsal (`wesley rehearse --dry-run --json`) deserves a stable result format. REALM IR captures the rehearsal verdict, timings, relevant counters, and any structured notes. Its schema will live at `schemas/realm.schema.json`. By pinning shape and fields, we let CI gate on objective facts: how many tests ran, which ones failed, how long phases took, and whether the environment matched expectations.

## Evidence and scores

Wesley emits evidence that links IR elements to generated artifacts and assigns coarse scores. These are already validated by `schemas/evidence-map.schema.json` and `schemas/scores.schema.json`. Evidence augments, never replaces, the IRs themselves. You can think of it as the “footnotes” and “grading rubric” for the pipeline’s claims.

## Envelope

The Envelope bundles a particular version of Schema IR, a set of QIR plans, and the evidence needed to believe the claim “this is what we built and how we validated it.” Its schema lives at `schemas/ir-envelope.schema.json`, and you can validate an instance with `wesley qir envelope-validate <file>`. In practice, envelopes travel to CI, auditors, and possibly future UIs.

```mermaid
flowchart TB
subgraph Bundle
E1[Schema IR]
E2[QIR Plans]
E3[Evidence & Scores]
E4[Optional: Plan & REALM]
end

B((ir-envelope.json))
E1 --> B
E2 --> B
E3 --> B
E4 --> B
```

## Validation in practice

Validation is part of development, not an afterthought. The JSON Schemas live in the repository and the CLI uses Ajv to validate:

- QIR plans: `wesley qir validate plan.qir.json`.
- Envelope: `wesley qir envelope-validate path/to/ir-envelope.json`.
- Evidence bundles: `wesley validate-bundle --bundle .wesley --schemas schemas/`.

CI also exercises these validators in Bats tests to keep the spec aligned with the code paths that produce and consume the IRs.

## Versioning and compatibility

Each schema is versioned under SemVer and tightened only when the benefit outweighs the churn. Breaking changes are rare and explicit. The Envelope includes a version string and can carry both old and new shapes transiently during migrations, but the general rule is forward‑only with compatibility shims living in code, not in the spec.

## Security notes

The IR family actively narrows the surface for mistakes. QIR uses explicit `ParamRef` with type hints so lowering never concatenates raw values. Identifier rendering is policy‑driven; strict mode quotes deterministically, and we validate or sanitize names when producing operation wrappers. When RLS is enabled, we prefer the database to enforce access rather than compiling redundant filters.

## What comes next

Two additional schemas will complete the family’s developer loop: Plan IR and REALM IR. As soon as those land, the CLI will validate `plan --explain --json` and `rehearse --dry-run --json` against their schemas in tests. An optional Ops Manifest schema will follow to support curated discovery in large repos.

The total surface then looks like a modest set of JSON Schemas, referenced by the CLI and by docs, moving in lockstep with the code. It’s deliberately small and intentionally boring—exactly the point for infrastructure we intend to trust.
