From 53ed31500f34361815bcd2c486e09a6d70af479d Mon Sep 17 00:00:00 2001
From: Petyo Ivanov
Date: Fri, 17 Apr 2026 19:50:50 +0300
Subject: [PATCH] Plans for documentation on prompt management

---
 plans/prompt-mgmt-docs-structure.md   | 246 +++++++++++++++++++++++
 plans/prompt-mgmt-release-concerns.md | 273 ++++++++++++++++++++++++++
 2 files changed, 519 insertions(+)
 create mode 100644 plans/prompt-mgmt-docs-structure.md
 create mode 100644 plans/prompt-mgmt-release-concerns.md

diff --git a/plans/prompt-mgmt-docs-structure.md b/plans/prompt-mgmt-docs-structure.md
new file mode 100644
index 000000000..4a291f1e9
--- /dev/null
+++ b/plans/prompt-mgmt-docs-structure.md
@@ -0,0 +1,246 @@
+# PRP 15: Prompt Management Documentation Structure
+
+## Goal
+
+Define the structure, page-by-page content, and drafting sequence for customer-facing documentation of Prompt Management. The structure must reflect the product as it actually ships, not as the `petyo/prompt-management` branch is currently implemented, so several pages are blocked on the product changes captured in [`prompt-mgmt-release-concerns.md`](./prompt-mgmt-release-concerns.md).
+
+After this PRP:
+
+- there is a clear outline of what pages exist and who each one is for
+- each page has a brief describing content, must-include points, and claims that must **not** be made yet
+- the drafting sequence is ordered so writers start with pages whose truth does not depend on unresolved code decisions
+- planning artifacts (this PRP, `meta-plan.md`, `variables-backing-pivot.md`, prior PRPs) are explicitly marked as non-customer material and are not referenced from the customer docs
+
+## Why
+
+- The feature is backed by a non-trivial data model (prompts as `kind='prompt'` rows in `variable_definitions`, versioned templates, autosaved settings, scenarios, runs) that does not map onto a typical "prompt library" mental model. Docs have to teach the model, not just list APIs.
+- Three audiences consume this feature differently: prompt authors in the UI, SDK integrators in application code, and project admins who configure gateways and permissions. A single linear doc does not serve them.
+- Several concerns from the pre-release review affect what can be truthfully written today. The structure below is explicit about which pages must wait.
+
+## Audiences
+
+Three entry points, three different first pages:
+
+- **Prompt authors** (PMs, prompt engineers): land on the Guides section; want "Create your first prompt" and "Ship a prompt to production".
+- **SDK integrators** (app developers): land on the SDK guide; want payload shape, grammar, and a working code sample.
+- **Project admins**: land on Administration; want plan/gateway prerequisites, permissions, and feature enablement.
+
+## Structure
+
+```
+Prompt Management
+├── 1. Overview
+│   ├── What it is
+│   ├── How prompts relate to Managed Variables
+│   └── When to use prompts vs. the Playground
+├── 2. Concepts
+│   ├── The four nouns: prompts, versions, scenarios, runs
+│   ├── Identifiers: display_name, slug, internal name
+│   ├── What a version freezes (template only)
+│   ├── Autosaved settings vs. versioned template
+│   ├── Template variables and the {{prompt}} reserved variable
+│   └── Anatomy of a scenario (messages, parts, tool calls/returns)
+├── 3. Guides (task-oriented)
+│   ├── Create your first prompt
+│   ├── Write templates with variables
+│   ├── Define scenarios (system/user/assistant/tool turns)
+│   ├── Run a prompt against a single scenario
+│   ├── Run a prompt across a dataset (batch run)
+│   ├── Save, compare, and promote versions
+│   ├── Configure tools for a prompt
+│   ├── Ship a prompt to production (cross-surface workflow)
+│   └── Use prompts from your app via the SDK
+├── 4. Template reference
+│   ├── Supported grammar (exact, closed set)
+│   ├── Variable naming rules
+│   ├── Reserved variables ({{prompt}})
+│   └── Error messages
+├── 5. API & SDK reference
+│   ├── SDK: fetching prompts via /v1/variables
+│   └── Payload shape (kind='prompt', template-only version contract)
+├── 6. Administration
+│   ├── Enabling the feature
+│   ├── Plan requirements
+│   ├── Gateway setup
+│   ├── Permissions model
+│   └── Naming strategy
+└── 7. Known limitations & roadmap
+```
+
+## Page-by-page briefs
+
+Each brief states: who it is for, what it must contain, and any claim that cannot be made until a specific release-concern is resolved.
+
+### 1.1 What it is
+- **For:** all audiences.
+- **Must contain:** one-paragraph positioning, a single annotated screenshot of the prompt editor, a diagram showing the four nouns.
+- **Must not:** claim block-helper template features, imply versions freeze model/tool settings, show an example that pins a version to production without crossing to the variables page.
+
+### 1.2 How prompts relate to Managed Variables
+- **For:** all audiences, especially users who already use managed variables.
+- **Must contain:** honest explanation that prompt rows coexist in `variable_definitions` under `kind='prompt'`; a callout that prompts appear in both the Prompts list and the Managed Variables list; a diagram of the responsibility split (authoring on Prompts page, rollout on Variables page).
+- **Blocked on:** concern 3 read-only surfacing (**if** "Currently serving: vN" lands in Prompts UI, this diagram changes). Draft a placeholder version now; rewrite after the code lands.
+
+### 1.3 When to use prompts vs. the Playground
+- **For:** new users deciding where to start.
+- **Must contain:** short decision table: Playground for one-off exploration, Prompts for anything intended to ship.
+
+### 2.1 The four nouns
+- **For:** all audiences.
+- **Must contain:** prompt, version, scenario, run, with one paragraph each and a small worked example. Cardinalities: 1 prompt : N versions, 1 prompt : N scenarios, 1 prompt : N runs, 1 run : N run cases.
+- **Evidence:** cite `logfire_db/crud/prompts.py` types if readers want to drill in.
+
+### 2.2 Identifiers
+- **For:** anyone who will see the three names.
+- **Must contain:** the three-column table from `release-concerns.md` §7 (display_name / slug / internal name). Explain the `prompt__` prefix the SDK sees today.
+- **Blocked on:** concern 7 SDK helper. If `logfire.prompt(slug=...)` lands, this page collapses; the prefix becomes an implementation detail.
+
+### 2.3 What a version freezes
+- **For:** anyone who will save or pin a version.
+- **Must contain:** an explicit statement that saving a version freezes the template text and nothing else. A worked example showing that changing the model between runs uses the new model on both old and new versions. A "what to do if you need the old model back" footnote pointing at run history snapshots.
+- **Blocked on:** concern 2 decision. If the team adopts the two-tier "publish" model, this page rewrites substantially. Draft content now but do not publish until the versioning decision is made.
+
+### 2.4 Autosaved settings vs. versioned template
+- **For:** same readership as 2.3.
+- **Must contain:** a small diagram of the two lifecycles: template updated on Save Version, settings updated immediately on every edit. A note that runs snapshot the settings at execution time, so run records are reproducible even though versions are not.
+- **Blocked on:** same as 2.3.
+
+### 2.5 Template variables and the `{{prompt}}` reserved variable
+- **For:** all prompt authors.
+- **Must contain:** how scenario variables are resolved; how `{{prompt}}` gets replaced by the rendered prompt inside scenario messages; rules for undefined variables.
+- **Blocked on:** concern 1 grammar decision. Do not specify what features the template supports on this page until all three surfaces agree.
+
+### 2.6 Anatomy of a scenario
+- **For:** prompt authors doing tool-calling work.
+- **Must contain:** message roles (system/user/assistant/tool), part kinds (text, tool-call, tool-return), and a worked multi-turn example.
+
+### 3.1 Create your first prompt
+- **For:** new prompt authors.
+- **Must contain:** screenshots, not text-only. Prereqs at the top (plan eligibility, gateway configured). End-to-end walkthrough: create, write template, add one variable, define default scenario, preview, Run.
+- **Blocked on:** concerns 1, 4, 5. Waits until grammar, release shape, and gateway UX are resolved.
+
+### 3.2 Write templates with variables
+- **Blocked on:** concern 1. The whole page is grammar-specific.
+
+### 3.3 Define scenarios
+- **For:** authors moving beyond single-turn.
+- **Must contain:** adding system/user/assistant turns, adding tool calls and tool returns, setting scenario variables.
+
+### 3.4 Run a prompt against a single scenario
+- **For:** authors iterating on a single case.
+- **Must contain:** how the Run button works, where output appears, where it is stored in run history. Cost note: each run spends gateway budget.
+- **Blocked on:** concern 5 prerequisite flow.
+
+### 3.5 Run a prompt across a dataset
+- **For:** authors doing evaluation.
+- **Must contain:** linking a dataset, variable column mapping, `max_cases` behavior (max 500 per call), where results land, how to view per-case output.
+- **Must include** a clear cost warning: batch runs spend real budget.
+- **Blocked on:** concern 6. Should not be published before a cost ceiling is added; otherwise this page is a footgun tutorial.
+
+### 3.6 Save, compare, and promote versions
+- **For:** authors going from draft to shipped.
+- **Must contain:** save version flow; version diff UI (template only; call this out explicitly); where promotion happens (variables page today).
+- **Blocked on:** concerns 2 and 3. If "publish snapshot" or inline label controls land, this page rewrites.
+
+### 3.7 Configure tools for a prompt
+- **For:** authors using tool-calling.
+- **Must contain:** tools editor, tool schema, note that tools are autosaved (not versioned).
+
+### 3.8 Ship a prompt to production
+- **For:** any author moving from iteration to deployment. **This is the load-bearing page.**
+- **Must contain:** the four-step cross-page workflow (save version in Prompts → assign label on Variables → SDK calls with label → confirm serving state). Screenshots of both surfaces. An explicit "your app still serves the old version until you assign the label" warning.
+- **Blocked on:** concern 3. Should not publish until at least a read-only "Currently serving" indicator exists in the Prompts UI. Otherwise the walkthrough requires users to trust out-of-band knowledge.
+
+### 3.9 Use prompts from your app via the SDK
+- **For:** app developers.
+- **Must contain:** payload shape returned by `/v1/variables/`, label-based fetching, rendering the template locally, passing rendered output to the user's model client. Working end-to-end sample.
+- **Blocked on:** concerns 1, 7. Needs the rendering helper or a documented grammar contract, and ideally `logfire.prompt(slug=...)` to hide the prefix.
+
+### 4. Template reference
+- **For:** authors and SDK users needing the authoritative grammar.
+- **Must contain:** closed-set list of supported tokens, with examples of supported and **unsupported** usage.
+- **Blocked on:** concern 1. This page cannot exist until the grammar is one thing across all three surfaces.
+
+### 5.1 SDK: fetching prompts
+- **For:** app developers.
+- **Must contain:** fetching with `logfire.var(...)` (or `logfire.prompt(...)` post-helper), the `kind='prompt'` discriminator, and the fact that attempting to write back raises `PromptVariableMutationNotAllowed`.
+
+### 5.2 Payload shape
+- **For:** app developers doing schema validation.
+- **Must contain:** JSON shape of `serialized_value` for prompt versions (template only, today), and the invariants the server enforces.
+- **Blocked on:** concern 2. If the payload widens to include settings/tools, the schema changes.
+
+### 6.1 Enabling the feature
+- **For:** project admins.
+- **Must contain:** plan eligibility check, current release state (Preview/GA), how to opt in if Preview.
+- **Blocked on:** concern 4. Needs a coherent release shape defined first.
+
+### 6.2 Plan requirements
+- Brief page; cross-link to billing docs.
+
+### 6.3 Gateway setup
+- **For:** project admins.
+- **Must contain:** gateway enablement, key creation, role-dependent paths, legacy vs. integrated distinction.
+- **Blocked on:** concern 5. If the gateway prereq funnel is consolidated or inline repair lands, the page shrinks considerably.
+
+### 6.4 Permissions model
+- **For:** security reviewers and project admins.
+- **Must contain:** current mapping (`read_variables` / `write_variables` gate all prompt capabilities), explicit call-out that run history is visible to all `read_variables` holders, batch-run cost implications.
+- **Blocked on:** concern 6. If `read_prompts` / `write_prompts` / `run_prompts` land, this page gets a cleaner matrix.
+
+### 6.5 Naming strategy
+- **For:** admins with large variable inventories.
+- **Must contain:** slug rules, hyphen/underscore collapse behavior, the `prompt__` prefix convention.
+- **Blocked on:** concern 7 slug validation fix (ideally).
+
+### 7. Known limitations & roadmap
+- **For:** all audiences, referenced defensively from other pages.
+- **Must contain:** honest list of current limitations (versioning scope, coarse permissions, two-place workflow, grammar status). This page is the pressure valve that lets earlier pages stay crisp.
+
+## Drafting sequence
+
+### Can be drafted now (not blocked on any concern)
+- §1.1 Overview: What it is (placeholder diagram OK)
+- §1.3 When to use prompts vs. Playground
+- §2.1 The four nouns
+- §2.6 Anatomy of a scenario
+- §3.3 Define scenarios
+- §3.7 Configure tools for a prompt
+- §6.2 Plan requirements
+- §7 Known limitations (this is the defensive page; it can ship before product fixes)
+
+### Can be drafted but not published until specific concerns are resolved
+- §1.2 How prompts relate to Managed Variables → concern 3
+- §2.2 Identifiers → concern 7
+- §2.3 / §2.4 Versioning and settings → concern 2
+- §3.6 Save/compare/promote → concerns 2, 3
+- §3.8 Ship to production → concern 3
+- §5.1 SDK fetching → concern 7
+- §5.2 Payload shape → concern 2
+- §6.1 Enabling the feature → concern 4
+- §6.3 Gateway setup → concern 5
+- §6.4 Permissions → concern 6
+- §6.5 Naming strategy → concern 7
+
+### Blocked on the grammar decision (concern 1)
+- §2.5 Template variables
+- §3.1 Create your first prompt
+- §3.2 Write templates with variables
+- §3.4 Run a single scenario
+- §3.5 Run across a dataset (also blocked on concern 6)
+- §3.9 SDK usage (also blocked on concern 7)
+- §4 Template reference
+
+## Out of scope
+
+- Public publication of `meta-plan.md`, `variables-backing-pivot.md`, `notes.md`, `transcript.md`, `ux-prototype-plan.md`, or any `plans/*prompt-mgmt*.md` file. These are engineering artifacts and remain internal.
+- Reference documentation for `ui_api/projects/prompts.py`. The UI API is not a public surface; only `/v1/variables/` is.
+- Migration guides. The feature is new; there is no prior-version migration story.
+- Per-prompt permissions documentation. Not planned for v1.
+
+## Success criteria
+
+- Every page in the structure has a brief.
+- Every brief states its blocking concern explicitly, or confirms none.
+- A writer reading this PRP plus `docs/prompt-management/release-concerns.md` can decide for each page: "draft now", "draft and hold", or "wait".
+- No customer-facing page makes a claim that is untrue on any of the three surfaces (editor preview, UI Run, SDK consumption) at the time of publication.
diff --git a/plans/prompt-mgmt-release-concerns.md b/plans/prompt-mgmt-release-concerns.md
new file mode 100644
index 000000000..39b1ab809
--- /dev/null
+++ b/plans/prompt-mgmt-release-concerns.md
@@ -0,0 +1,273 @@
+# Prompt Management: Pre-Release Concerns
+
+_Created: 2026-04-17_
+_Status: Review input for release planning and docs effort_
+
+This document summarizes concerns identified while reviewing the `petyo/prompt-management` branch against the goal of writing customer-facing documentation. Each concern is described with enough evidence to be actionable, and ranked by the severity of its effect on either the release or the docs that describe it. Concerns that are purely repo hygiene (stale planning artifacts, naming collisions) are intentionally out of scope here.
+
+## TL;DR
+
+Seven concerns, four of which should be resolved in code before any customer-facing documentation is written:
+
+1. **Template rendering is inconsistent across three surfaces.** Editor preview uses real Handlebars, the UI Run path uses a regex, and SDK consumers have no renderer at all. Docs cannot truthfully describe what a template is until these agree.
+2. **Saved versions freeze only the template.** Model, tools, and settings live in a mutable autosaved row. A pinned "version 2" can behave differently at different times.
+3. **Rollout lives on one page, authoring on another.** "Which version is live?" is invisible from the Prompts UI; promoting a version requires crossing over to the Managed Variables detail page.
+4. **The feature has already released to the API but not to the UI or the SDK.** Frontend is localStorage-flag-gated, backend endpoints are live, SDK lacks idiomatic helpers. The three surfaces need to land coherently before GA.
+5. **Gateway prerequisites are an eight-state funnel with no inline repair.** First-run users bounce through unrelated settings pages before they can click Run.
+6. **Permissions are coarse.** A single `write_variables` bit covers iterate, promote, single-run, and 500-case batch runs.
+7. **Each prompt has three identifiers and the internal one leaks to SDK callers.** `display_name`, `prompt_slug`, and `name = prompt__`; SDK users must know the prefix convention.
+
+Concerns 1–4 are release-blocking; 5–7 are ship-acceptable if docs are honest about the state, but each has product work that would pay for itself quickly.
+
+---
+
+## 1. Template rendering is inconsistent across three surfaces
+
+The same template renders differently in three places.
+
+| Surface | Renderer | Grammar honored |
+|---|---|---|
+| Editor preview | real Handlebars (`handlebars.create()`), `src/app/prompts/lib/prompt-rendering.ts:1,37` | identifiers, nested paths, block helpers (`{{#if}}`, `{{#each}}`), helpers |
+| UI Run / batch | regex `\{\{\s*([a-zA-Z0-9_.-]+)\s*\}\}`, `services/prompt_rendering.py:10` | literal `{{name}}` substitution; dots allowed in identifiers but not treated as nested paths |
+| SDK consumer | no renderer ships; user writes their own | whatever the consumer implements (the demo hand-rolls a regex in `demo_prompt_variables_pydantic_ai.py:72-78`) |
+
+The CodeMirror editor also uses Handlebars **syntax highlighting** (commit `9caa065777`), which asserts editor-level support for features the server does not honor.
+
+### Why it matters
+
+- A user writing `{{#each items}}{{name}}{{/each}}` sees per-item output in preview, then clicks Run and the model receives the raw `{{#each items}}…{{/each}}` literal. No error, no warning.
+- `{{user.name}}` has two incompatible meanings: nested lookup in the frontend, flat identifier `user.name` on the backend.
+- `scenario_variables: dict[str, str]` at the schema layer (`PromptDraftRunRequest`) rules out the data shapes block helpers would need. The schema forecloses what the editor advertises.
+- SDK consumers have no grammar contract. Every app picks its own renderer, and at least one of {editor preview, UI Run, production} will disagree with the other two.
+
+### What needs to change
+
+Pick a grammar and enforce it on all three surfaces. Three defensible exits:
+
+- **Narrow everywhere to identifier substitution.** Frontend renderer drops to plain substitution; remove Handlebars highlighting. Smallest technical fix, meaningful product regression.
+- **Widen backend to Handlebars.** Needs a Python Handlebars implementation (`pybars3` is unmaintained) and `scenario_variables` typed as `dict[str, JSONBData]` across the API and the inputs panel.
+- **Declare grammar via a typed contract.** Publish `{ grammar: "logfire-v1" }` in prompt payloads; support a minimal grammar by default and an advanced grammar on opt-in. Most work, best long-term product answer.
+
+Regardless of exit, two things are needed before docs:
+
+- A shared cross-renderer fixture suite (promised in `variables-backing-pivot.md`, not yet present).
+- An SDK rendering helper (e.g., `logfire.render_prompt(template, variables)`) so the canonical grammar has a reference implementation and SDK users stop inventing their own.
+
+---
+
+## 2. Versioning scope covers only the template
+
+### What gets versioned
+
+```python
+# logfire_db/crud/prompts.py:910
+serialized_value = _serialize_prompt_template(request['template'])
+# which is:
+def _serialize_prompt_template(template: str) -> str:
+    return json.dumps(template)
+```
+
+`variable_versions.serialized_value` for a prompt contains the template string and nothing else.
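Reviewer sketch (not part of the patch): the consequence of the excerpt above can be reproduced with the standard library alone. The function below mirrors the quoted helper; the `settings` dict and its keys are hypothetical stand-ins for the autosaved settings row, not the real schema.

```python
import json


def serialize_prompt_template(template: str) -> str:
    """Mirror of the quoted helper: the stored value is the JSON-encoded template only."""
    return json.dumps(template)


# Hypothetical settings, standing in for the autosaved settings row.
settings = {"model": "model-a", "temperature": 0.2}

template = "Hello {{name}}, welcome aboard."
stored = serialize_prompt_template(template)

# Settings change after the version was saved...
settings["model"] = "model-b"

# ...but the stored payload decodes to a plain string, so it carries neither state.
restored = json.loads(stored)
print(type(restored).__name__)
print(restored == template)
```

Under these assumptions the decoded payload is a bare string, so a consumer pinned to a version has nothing to read the model or temperature from.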
+
+Model, tools, `api_format`, `route`, `model_settings`, and `stream` live in `prompt_settings`, primary-keyed on `prompt_id`, overwritten on every settings change (`final_schema_logfire.sql:1749-1765`). Runs snapshot the settings that were live at execution time, but versions do not.
+
+### Why it matters
+
+- "Version" in competing products (Humanloop, LangSmith, PromptLayer) freezes the full callable. Here it freezes the template string only. Users pinning to "version 2" get v2's template rendered against *whatever settings are configured right now*.
+- Rollback is partial. If someone sets `temperature=2.0` and production breaks, rolling back a version does nothing, because the setting isn't attached.
+- Any "compare versions" UI shows template diffs only. Model or tool changes are invisible.
+- The SDK contract consumes this directly: `/v1/variables/` ships `{"template": "..."}`. Authoring decisions about model and tools never reach the app. The demo's `ManagedPromptPayload` declares `model`, `tools`, `api_format` as optional-with-`None` defaults and then raises `SystemExit('Gateway mode requires the prompt version to have a stored model')` at line 102, which is evidence of code written against an interface that does not yet exist on the server.
+
+### What needs to change
+
+Three resolutions, in roughly increasing scope:
+
+- **Document the narrow contract honestly.** Rename the concept ("template history") and remove any UI affordance that implies broader snapshotting. Smallest change, weakest product.
+- **Widen versions.** Snapshot settings and tools into the version row. Breaks the autosaved-settings UX that the branch is built around.
+- **Two-tier model.** Keep autosaved settings as a working scratchpad; add an explicit "publish" action that freezes everything into a version. This is where most competitors land.
+
+Docs must in all cases state clearly that "version" in Logfire Prompts today means template only.
+
+---
+
+## 3. Rollout lives on a different page from authoring
+
+Prompt-backed rows render in the Managed Variables list with explicit disclosure (`project-settings-variables.tsx:278-339`): a "Prompt" badge, status "Backs a prompt", and inline copy "Open this row here to edit values and routing. Prompt metadata stays in Prompt Management."
+
+The disclosure is honest. The workflow it implies is the problem.
+
+### The responsibility split
+
+| Task | Page |
+|---|---|
+| Edit template / iterate on v4 | `/prompts//` |
+| Change model / tools / settings | `/prompts//` |
+| Rename / describe the prompt | `/prompts//` |
+| **Point label `production` at v3** | `/variables/prompt__/` |
+| **See which version currently serves traffic** | `/variables/prompt__/` |
+| **Configure percentage rollout** | `/variables/prompt__/` |
+
+### Why it matters
+
+The second half of the table is the entire point of versioning. Without labels, a saved version is inert: the SDK has no way to ask for it without hard-coding a numeric version. None of that lives in the Prompts UI.
+
+A realistic "ship to production" workflow today is:
+
+1. Open `/prompts/welcome_email/`, iterate, save v4.
+2. Leave. Navigate to Managed Variables. Find `prompt__welcome_email`.
+3. Move the `production` label from v3 to v4 on the values/labels tab.
+4. Hope the SDK is calling with `label='production'`.
+
+At step 1 the user thinks they have shipped. They have not: their app is still serving v3 until step 3. The Prompts UI never indicates this. The version history shows "v4 by you" with no "currently serving: v3" signal.
+
+### What needs to change
+
+- **Mirror serving state into the Prompts UI.** At minimum, a "Currently serving: v3" strip next to the version selector, even if editing still happens on the variables page. This is a read-only fix and addresses the worst of the concern.
+- **Eventually, mirror the label controls.** Let authors promote versions from the Prompts page directly. The variables detail stays as the advanced/legacy view.
+
+Docs independently need a "Shipping a prompt to production" page that teaches the cross-over explicitly. Without that page, the feature silently fails the first time a user tries to use versions for their stated purpose.
+
+---
+
+## 4. The feature has already released to the API but not to the UI or the SDK
+
+### Gating state
+
+- Backend routes (`ui_api/projects/prompts.py`, `/v1/variables/` with `kind='prompt'`): live, unconditional.
+- Frontend: `LocalFeatureFlagProtectedRoute` on `PROMPT_MANAGEMENT`. Flag is **not** in `SHIPPED_FEATURE_FLAGS` (`useFeatureFlag.ts:29`), defaults off, toggled per-browser via localStorage.
+- SDK: no idiomatic helper. Users call `logfire.var(name='prompt__', ...)` and hand-roll a renderer (per the demo).
+- Database migrations: shipped with the backend; run regardless of flag state.
+
+### Why it matters
+
+- `VariableConfig.kind` is now a new field returned to every SDK caller, including consumers running today against prod. Strict-schema validators could break without warning.
+- Anyone who figures out `/ui_api/projects//prompts/` can create real prompts in real projects while the UI still says the feature is unavailable. Rows appear in the Managed Variables list immediately.
+- `LocalFeatureFlagProtectedRoute` is a per-browser toggle. There is no org-level or project-level gating. Targeted rollout (e.g., design partners) is not supported without new server-side work.
+- Once migrations apply, the schema is in every project whether the UI is on or not. Back-out after any customer writes prompt data is data-destructive.
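Reviewer sketch (not part of the patch): the strict-schema risk in the first bullet can be illustrated without any SDK code. The field names in `KNOWN_FIELDS` are hypothetical, since the real `VariableConfig` fields are not listed in this document; only the new `kind` field comes from the text above.

```python
# Hypothetical pre-change schema; the real VariableConfig fields are not listed here.
KNOWN_FIELDS = {"name", "serialized_value"}


def parse_strict(payload: dict) -> dict:
    """Reject any field the consumer did not know about when it was written."""
    unknown = set(payload) - KNOWN_FIELDS
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    return payload


def parse_tolerant(payload: dict) -> dict:
    """Keep known fields and silently ignore additions such as `kind`."""
    return {key: payload[key] for key in KNOWN_FIELDS if key in payload}


old_payload = {"name": "greeting", "serialized_value": '"hi"'}
new_payload = {**old_payload, "kind": "prompt"}

parse_strict(old_payload)  # accepted before the field was added
try:
    parse_strict(new_payload)
except ValueError as exc:
    print(f"strict consumer breaks: {exc}")

print(parse_tolerant(new_payload))
```

A consumer written in the strict style fails as soon as the server starts returning `kind`, with no change on the consumer's side; the tolerant style keeps working but never sees the discriminator.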
+
+### What needs to change
+
+Three release states must be defined and aligned before docs are written against any of them:
+
+| State | API | UI | SDK helpers | Audience |
+|---|---|---|---|---|
+| Now | live | hidden | bare | internal only |
+| Preview | live | opt-in | minimal documented surface | design partners |
+| GA | live | default on | idiomatic `logfire.prompt(...)` | all |
+
+Docs written today against "Now" would describe a workflow requiring a UI the user cannot open. Writing docs against "Preview" is workable if a `logfire.prompt(...)` (or equivalent) lands in the SDK first. Writing docs against "GA" requires the flag flip and ideally org-level gating.
+
+Related: 14 `plans/2026-04-*prompt-mgmt*.md` and `docs/prompt-management/*` files live in the repo and contain engineering-internal language (pivots, acknowledged tradeoffs). Any docs-publishing pipeline should explicitly exclude these paths.
+
+---
+
+## 5. Gateway prerequisites are an eight-state funnel with no inline repair
+
+`prompt-gateway-requirements.ts` resolves to one of eight terminal states: `ready` (integrated or legacy), `loading`, `query-error`, `unsupported-plan`, `gateway-disabled`, `missing-key` (no-keys or missing-token-permission), `selected-key-unavailable`, `select-key`. Each non-`ready` state suggests an action but does not provide inline repair; every recovery path requires leaving the prompt page.
+
+### Why it matters
+
+- First-run users bounce through 2–3 unrelated settings pages before their first Run button works.
+- The `unsupported-plan` state is a marketing leak: the Prompts feature should not render in the sidebar for ineligible orgs. Showing the feature and blocking Run with "upgrade your plan" loses the evaluation.
+- Two gateway systems coexist: `hasLegacyGatewayConfig` returns `ready: legacy`; users on the legacy setup never see the integrated flow. Docs must describe both, or the team picks one to deprecate before GA.
+- Role-dependent copy assumes users know their role (`isOrgAdmin`, `isProjectAdmin`). Users often do not. A small "You are a {role}" line in the prerequisite card would remove most of this.
+- None of this applies to the SDK path. The gateway is a UI-only dependency. Docs must draw this line sharply: Editor Run needs a Logfire gateway key; production consumption via SDK does not.
+
+### What needs to change
+
+Product fixes that are cheap relative to their impact:
+
+- Inline "Create gateway key" modal on the prompt page for project admins.
+- Hide the Prompts nav entry when `isGatewayPlanEligible === false`.
+- Pick a gateway system (integrated vs. legacy) and start a deprecation.
+- Add a user-role indicator to the prerequisite card.
+
+Docs should defer the per-role playbook until it is clear which funnel shape ships. A single "Prepare your project for prompt runs" page must precede any "Create your first prompt" walkthrough.
+
+---
+
+## 6. Permissions are coarse
+
+Every prompt route uses one of two permissions:
+
+```python
+# ui_api/projects/prompts.py: all 23+ route declarations
+dependencies=[Permissions(organization=None, project=['read_variables'])]
+# or
+dependencies=[Permissions(organization=None, project=['write_variables'])]
+```
+
+No `read_prompts` or `write_prompts` was added; the `test_roles.py` diff is alphabetical re-sorting only.
+
+### The conflations
+
+| Capability | Gated by | Risk |
+|---|---|---|
+| View template and settings | `read_variables` | low |
+| View run history (inputs, model outputs, costs) | `read_variables` | data-sensitive |
+| Save a draft version | `write_variables` | low |
+| Promote a version to a label (on variables page) | `write_variables` | production-sensitive |
+| Execute a single run | `write_variables` | cost-sensitive |
+| Execute a batch run up to 500 cases | `write_variables` | heavily cost-sensitive |
+
+### Why it matters
+
+- Run records include model outputs. Anyone with `read_variables` can read every response produced by the project; a permission named for config access is now gating model output data.
+- "Iterate on a draft" and "promote to production" are the same bit. Competing products separate author from publisher for good reason.
+- Batch runs spend real gateway budget. No cost ceiling, no approval flow, no second-confirmation. `max_cases` accepts up to 500 per call. Fat-finger incident waiting to happen.
+- Existing `read_variables` assignments (granted when it meant config-read) now silently cover prompt run history. That is a blast-radius expansion without notification.
+
+### What needs to change
+
+None of this blocks v1 for small teams, but before GA with any enterprise messaging:
+
+- Introduce `read_prompts` / `write_prompts` as separate permission names, even if they map 1:1 to variables today. Cheap seam for future tightening without schema migration.
+- Split `run_prompts` from `write_prompts` so interns can iterate without spend.
+- Add a batch-run cost ceiling (per-user or per-project daily cap) and surface it.
+- Document the current scope explicitly, including the "run history is visible to all `read_variables` holders" note.
+
+---
+
+## 7. Each prompt has three identifiers; the internal one leaks to SDK callers
+
+Namespace isolation between prompts and general variables is correctly enforced (`variable_definitions_prompt_identity_check` in `final_schema_logfire.sql:2110`, plus the `_validate_general_variable_name` guard at `services/variables.py:132`). Prompts and general variables cannot collide.
+
+The cost is three identifiers per prompt:
+
+| Field | Example | Who sees it |
+|---|---|---|
+| `display_name` | `"Welcome Email"` | UI title, search |
+| `prompt_slug` | `welcome_email` | URL (`/prompts/welcome_email/`) |
+| `name` | `prompt__welcome_email` | SDK callers (`logfire.var(name=...)`) |
+
+The `name` is derived from the slug via `demo_prompt_variables_pydantic_ai.py:50-51`:
+
+```python
+def get_prompt_internal_variable_name(prompt_slug: str) -> str:
+    return f'{PROMPT_VARIABLE_PREFIX}{prompt_slug.replace("-", "_")}'
+```
+
+### Why it matters
+
+- The `prompt__` prefix is an internal storage convention appearing in user SDK code. The demo defines `get_prompt_internal_variable_name` as a helper users must call; that is evidence the boundary has not settled.
+- `slug.replace("-", "_")` is lossy. Slugs `order-confirmation` and `order_confirmation` both derive to `prompt__order_confirmation`. The second creation attempt hits the `(project_id, name)` uniqueness constraint with an error that references a name the user never typed.
+- The slug→name derivation is enforced in service code, not at the database layer. Any future change to the derivation breaks lookup for pre-existing rows.
+
+### What needs to change
+
+- Ship an SDK helper (`logfire.prompt(slug=...)` or equivalent) that hides the prefix. The prefix then becomes an implementation detail, not a public contract.
+- Disallow either hyphens or underscores in new slugs at the input layer, so the derivation is collision-free by construction.
+- Docs: one early Concepts table showing the three identifiers, which ones users touch, and which one the SDK consumes. New users need this before they can make sense of example code.
+
+---
+
+## Recommended sequencing
+
+1. Pick a template grammar and make preview, Run, and SDK agree (concern 1). Ship the cross-renderer fixture suite and the SDK rendering helper.
+2. Decide the release shape (concern 4): define Preview and GA states, align SDK helper, plan flag-flip timing.
+3. Add "currently serving" visibility to the Prompts UI (concern 3); read-only is enough for the first pass.
+4. Decide the versioning story (concern 2) and either narrow the name or add a publish action.
+5. Before GA: split permission names (concern 6), add a batch-run cost cap, hide the Prompts nav item on ineligible plans (concern 5).
+6. Ship the SDK `logfire.prompt(...)` helper so the `prompt__` prefix stops leaking (concern 7).
+
+Docs can begin drafting admin and concept pages now. Template reference, SDK usage, and "Create your first prompt" should wait until concerns 1, 2, and 4 are resolved.