diff --git a/.gitignore b/.gitignore index 40b2195..e5a6dd4 100644 --- a/.gitignore +++ b/.gitignore @@ -71,10 +71,6 @@ skills/website-to-hyperframes .agents/ .claude/skills/ -# HyperFrames render artifacts (regenerable from compositions) -video/snapshots/ -video/renders/ - # Rendered timelapse mp4 — hosted externally for the hackathon submission demo-output/videos/ @@ -83,11 +79,6 @@ demo-output/videos/ /plan.md /research.md -# Claude Design polish handoff bundles — committed per-slot only after review -skills/webster-video/polish-slots/**/handoff/ -skills/webster-video/polish-slots/handoff-shared/ -skills/webster-video/polish-slots.zip - # Internal tracking docs — preserved in ~/Vault/Projects/webster/internal-tracking/ context/EXPANSION-TASKS.md context/E2E-IMPLEMENTATION-TRACKER.md diff --git a/AGENTS.md b/AGENTS.md index fc40374..549ba88 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -14,7 +14,7 @@ This file is for implementation operators. See `skills/webster-lp-audit/SKILL.md Two active workstreams: - **Production Webster** — Nicolette's weekly landing-page improvement council runs on `main`. Operator surface: `/webster-weekly-council` (skill at `skills/webster-weekly-council/SKILL.md`) or the single-page runbook at `prompts/second-wbs-session.md`. Both produce identical artifacts; the prompt is the locked source-of-truth runbook. This is live for her business; do not break it. -- **Hackathon expansion** — Single-substrate Richer Health LP demo with a simulation runner producing timelapse assets. Deadline **2026-04-28**. Working branch: `dev/`. See `context/VISION.md` for canonical north-star. +- **Single-substrate Richer Health LP demo** with a simulation runner producing 11-week timelapse assets under `demo-output/landing-page/`. See `context/VISION.md` for canonical north-star. ## First actions every session @@ -129,8 +129,9 @@ Use `TaskCreate` / `TaskUpdate` for multi-step work within a single session. 
Tas Webster ships these skills: - `skills/webster-weekly-council/SKILL.md` — operator surface for the weekly run. Library skill: SKILL.md index + on-demand phase references + helper scripts. Slash-command form: `/webster-weekly-council`. Equivalent single-page runbook at `prompts/second-wbs-session.md`. +- `skills/webster-onboarding/SKILL.md` — first-time setup for a new operator (brand context capture, key checklist, repo scaffold, agent + memory-store provisioning, first council) - `skills/webster-lp-audit/SKILL.md` — shared council run discipline (referenced by production critics) -- `skills/webster-browser-audit/SKILL.md` — headless browser audit for visual review +- `skills/webster-browser-audit/SKILL.md` — headless browser audit capability for visual review If your work modifies any skill, test with a sample invocation before committing. The weekly-council skill must stay artifact-equivalent with `prompts/second-wbs-session.md` — when in doubt, fix the skill, never the prompt. diff --git a/README.md b/README.md index 72a719e..1d06579 100644 --- a/README.md +++ b/README.md @@ -6,7 +6,7 @@ ## The one-line pitch -Small businesses pay marketing agencies $2K–$20K/month for landing-page optimization that arrives in 4–6 week cycles. Webster runs the audit + proposal loop for ~$0.60/month in Opus 4.7 tokens and hands the operator a reviewable draft PR each week. The win is cycle time (minutes vs weeks) and the baseline cost of the analytical loop — a human still reviews the PR before it ships. +A council of 9 Claude Managed Agents audits a landing page once a week, synthesizes findings across SEO, brand-voice, compliance, conversion, copy, and rendered-layout lenses, and hands the operator a reviewable draft PR. 
The win is cycle time — the analytical loop runs in tens of minutes instead of multi-week agency rounds — and a runtime mechanism (Critic Genealogy) where Opus 4.7 detects an unowned audit gap and registers a brand-new specialist agent against the live API mid-run. A human still reviews the PR before it ships. ## The hero moment — Critic Genealogy @@ -81,12 +81,24 @@ bun scripts/critic-genealogy.ts --fixtures scripts/__tests__/fixtures/genealogy **Live-run evidence:** the operator surface is the [`/webster-weekly-council`](skills/webster-weekly-council/SKILL.md) skill (library: SKILL.md index + on-demand phase references + helper scripts); the full single-page runbook lives at [`prompts/second-wbs-session.md`](prompts/second-wbs-session.md). Registration IDs live in `environments/webster-council-env.id` and `context/*/id.txt`. Run artifacts are written under `history//` when the weekly run executes. -**Demo arc artifacts:** the hackathon timelapse animates an 11-week simulation council run. Per-week deliverables live under [`demo-output/landing-page/`](demo-output/landing-page/) (`w00..w10`): desktop/mobile/tablet screenshots, heatmap JSON+SVG, synthetic analytics, and the visual reviewer's markdown verdict. Anthropic Managed Agents memory-store provisioning is captured at [`assets/memory-stores-screenshots/`](assets/memory-stores-screenshots/). The rendered timelapse is hosted externally (link in the submission form); reproduce locally with `bun skills/webster-video/scripts/hydrate-demo-assets.ts && cd video && npx hyperframes render -q high --strict`. +**Demo arc artifacts:** an 11-week simulation council run, week-by-week, browsable as files. Start at [`demo-output/landing-page/INDEX.md`](demo-output/landing-page/INDEX.md) for the narrated walk-through. 
Each week directory under [`demo-output/landing-page/w00..w10/`](demo-output/landing-page/) contains desktop/mobile/tablet screenshots, heatmap JSON+SVG, synthetic analytics, and the visual reviewer's markdown verdict. Anthropic Managed Agents memory-store provisioning is captured at [`assets/memory-stores-screenshots/`](assets/memory-stores-screenshots/). The render pipeline that turns these per-week assets into a timelapse video is submission tooling and lives outside the public repo. **Hero code:** [`scripts/critic-genealogy.ts`](scripts/critic-genealogy.ts) is the runtime specialist-spawn path; [`scripts/__tests__/critic-genealogy.test.ts`](scripts/__tests__/critic-genealogy.test.ts) and [`scripts/__tests__/fixtures/genealogy`](scripts/__tests__/fixtures/genealogy) are the fixture proof. **Validate locally:** run `bun install` once, then `bun run validate` for type-check, zero-warning lint, format, agent schemas, findings format, markdown, and tests. +## 5-minute judge tour + +If you're evaluating this submission and have five minutes: + +1. **Read the 30-second pitch + hero moment above** (you're here) — that's the architecture and the novel-mechanic claim in one screen. +2. **Open [`demo-output/landing-page/INDEX.md`](demo-output/landing-page/INDEX.md)** — narrated walk through the 11-week LP timelapse. One paragraph per week, links to that week's screenshots + heatmap + visual-reviewer verdict. +3. **Click into one week's `visual-review.md`** (e.g. [`w04/visual-review.md`](demo-output/landing-page/w04/visual-review.md) for the largest beat, [`w10/visual-review.md`](demo-output/landing-page/w10/visual-review.md) for the terminal polish) — that's what the council actually wrote about its own changes. +4. **Read [`scripts/critic-genealogy.ts`](scripts/critic-genealogy.ts)** — the hero file. 
Two tools (`report_no_gap` / `report_gap`), Opus 4.7 picks one, then drafts a JSON spec, registers it via `POST /v1/agents`, and invokes it via `POST /v1/sessions` — all at runtime. +5. **Optional, if a terminal is handy:** `bun install && bun scripts/critic-genealogy.ts --fixtures scripts/__tests__/fixtures/genealogy --dry-run`. Live Opus 4.7 call against the committed fixture findings, ~15s wall clock, prints the new critic spec it would have registered. + +[`agents/production/`](agents/production/) holds the 9 pre-registered specs; [`agents/simulation/`](agents/simulation/) holds the 1:1 simulation mirror used for the timelapse run. [`prompts/second-wbs-session.md`](prompts/second-wbs-session.md) is the production weekly orchestrator (locked); [`skills/webster-weekly-council/SKILL.md`](skills/webster-weekly-council/SKILL.md) is the same flow as a Claude Code skill. + ## What's in the repo ```text @@ -99,7 +111,9 @@ webster/ ├── prompts/ first-wbs-session.md (bootstrap), second-wbs-session.md (weekly run runbook) ├── scripts/ validate-agents, validate-findings, critic-genealogy ├── skills/ webster-weekly-council (operator surface for the weekly run), -│ webster-lp-audit (shared critic discipline) +│ webster-onboarding (first-time setup for a new operator), +│ webster-lp-audit (shared critic discipline), +│ webster-browser-audit (Playwright-headless audit capability) ├── .github/workflows/ CI: type + lint + format + schema + findings + markdown + tests ├── .husky/ pre-commit runs the same gates locally └── AGENTS.md operator guide for in-repo work @@ -117,7 +131,7 @@ The live council runner is a Claude Code library skill: [`/webster-weekly-counci 6. Runs the redesigner — commits `history/YYYY-MM-DD/proposal.md` + `decision.json`. 7. Opens a draft PR. -Expected wall-clock: 30–50 min. Expected API cost: ~$0.16–0.25 per run. +Wall-clock per run is in the tens of minutes; the bulk of that is the parallel critic fan-out, not orchestration overhead. 
**Submission note**: all 9 agent specs are registered against the live Anthropic API (IDs in `environments/webster-council-env.id` + `context/*/id.txt`), the genealogy hero is live-validated (~$0.03 Opus 4.7 dry-run documented above), and the full orchestration prompt is committed. The end-to-end fan-out that produces `history/YYYY-MM-DD/` artifacts is the operator-triggered weekly run — `history/` is empty at submission time by design. Loop has been exercised component-by-component. @@ -131,7 +145,7 @@ bun run validate Chains: `tsc --noEmit` → `eslint --max-warnings 0` → `prettier --check` → agent+environment schema validation → findings format validation → markdownlint → `bun test`. Every gate is blocking. Pre-commit hook enforces the same set. CI enforces the same set on push + PR. See [`context/QUALITY-GATES.md`](context/QUALITY-GATES.md). -Current state: 175 tests passing, 0 lint warnings, 0 type errors, 18 JSON specs valid, 6 findings files valid. +Current state: 29 test files green via `bun run validate`, 0 lint warnings, 0 type errors, 18 JSON specs valid, 6 findings files valid. ## Prize-lane alignment diff --git a/context/ARCHITECTURE.md b/context/ARCHITECTURE.md index 4603361..b8816d5 100644 --- a/context/ARCHITECTURE.md +++ b/context/ARCHITECTURE.md @@ -2,7 +2,7 @@ > Mirrors [[webster-architecture]] in vault. Canonical source is this file for in-repo operators; vault file for cross-session memory. > -> **Submission state**: Layers 1–4 + Layer 7 shipped. Layer 5 (`site/` fork + analytics pixel + `scripts/seed-mock-history.ts`) is scoped out for submission — the mock seeder is in phase 1 of the weekly-council skill (`skills/webster-weekly-council/references/seed-history.md`) and equivalently in `prompts/second-wbs-session.md` Step 1, and the redesigner emits `proposal.md` instead of `proposal.diff`. Layer 6 (video) is blocked on Richie's voice record. See `context/FEATURES.md` for per-row status. 
+> **Shipped state**: 9 production Managed Agents, mirrored 1:1 by 9 `webster-lp-sim-*` simulation specs. Full council loop runs end-to-end — planner → fan-out → redesigner → visual review — with critic genealogy as the runtime specialist-spawn beat. The redesigner emits `proposal.md` (PR body) rather than `proposal.diff`; a real `site/` fork that lets the council emit a one-click diff is roadmap, not pending. See `context/FEATURES.md` for the full inventory. ## System Overview @@ -13,8 +13,12 @@ │ Claude Code Session (orchestrator — Opus 4.7) │ │ ├─ reads site/ + history/ + context/critics/*/findings.md │ │ │ │ -│ ├─ fan-out: POST /v1/sessions for each of 6 pre-registered │ -│ │ Managed Agents (parallel), then send user.message event │ +│ ├─ planner session (Opus 4.7) │ +│ │ ├─ marshals memory + verdicts + monitor anomalies │ +│ │ └─ writes plan.md with direction_hint for the week │ +│ │ │ +│ ├─ fan-out: POST /v1/sessions for 6 pre-registered Managed │ +│ │ Agents (parallel), then send user.message event │ │ │ ├─ monitor (Haiku 4.5) — detects analytics anomalies │ │ │ ├─ 5 specialist critics (Sonnet 4.6) │ │ │ │ ├─ SEO, brand-voice, FH-compliance, │ @@ -24,7 +28,9 @@ │ ├─ redesigner session (Opus 4.7) │ │ │ ├─ orchestrator gathers committed findings │ │ │ ├─ passes them as input text to redesigner session │ -│ │ └─ redesigner outputs proposal.diff + decision.json │ +│ │ └─ redesigner outputs proposal.md + decision.json │ +│ │ │ +│ ├─ visual-reviewer (Opus 4.7) — post-redesign visual audit │ │ │ │ │ ├─ Critic Genealogy (runtime creation, public beta) │ │ │ ├─ detects pattern no existing critic owns │ @@ -55,7 +61,7 @@ - Per-critic context: `context/critics/{name}/findings.md` - Run artifacts: `history/YYYY-MM-DD/{analytics.json, council-output/, synthesis.md, proposal.md, decision.json}` -### Layer 2: Managed Agent Critics (7 pre-registered) +### Layer 2: Pre-registered Managed Agents (9 production, mirrored 1:1 by 9 simulation) **Environment is a separate 
resource** (`POST /v1/environments`), registered once per workspace and referenced by ID in every session. There is NO in-agent `environment:` or `resources:` field. @@ -66,17 +72,23 @@ Environment `environments/webster-council-env.json`: - Networking: `limited` with `allowed_hosts: [api.github.com, github.com, raw.githubusercontent.com, api.anthropic.com]`, `allow_mcp_servers: true`, `allow_package_managers: true` - No GitHub-repo mount primitive exists — the agent `git clone`s at session start via bash using a `GITHUB_TOKEN` passed in the first user.message -Agent specs (JSON, not YAML — matches `POST /v1/agents` schema): +Production specs (JSON, not YAML — matches `POST /v1/agents` schema): + +| Spec | Model | Role | +| ------------------------------------------------ | ---------- | -------------------------- | +| `agents/production/webster-planner.json` | Opus 4.7 | orchestrator (pre-fan-out) | +| `agents/production/brand-voice-critic.json` | Sonnet 4.6 | critic | +| `agents/production/conversion-critic.json` | Sonnet 4.6 | critic | +| `agents/production/copy-critic.json` | Sonnet 4.6 | critic | +| `agents/production/fh-compliance-critic.json` | Sonnet 4.6 | critic | +| `agents/production/seo-critic.json` | Sonnet 4.6 | critic | +| `agents/production/webster-visual-reviewer.json` | Opus 4.7 | critic (post-redesign) | +| `agents/production/webster-monitor.json` | Haiku 4.5 | monitor | +| `agents/production/webster-redesigner.json` | Opus 4.7 | redesigner | -- `agents/production/webster-monitor.json` — Haiku 4.5 -- `agents/production/brand-voice-critic.json` — Sonnet 4.6 -- `agents/production/fh-compliance-critic.json` — Sonnet 4.6 -- `agents/production/seo-critic.json` — Sonnet 4.6 -- `agents/production/conversion-critic.json` — Sonnet 4.6 -- `agents/production/copy-critic.json` — Sonnet 4.6 -- `agents/production/webster-redesigner.json` — Opus 4.7 +Simulation set at `agents/simulation/webster-lp-sim-*` mirrors the production roster 1:1 — same models, same role 
distribution, no extra surface for judges to evaluate. Sim agents are additive, never touching production. **No `callable_agents`** (research preview) on either set. -Each spec has: `name`, `model`, `system` (multi-line string with escaped \n), `tools: [{type: agent_toolset_20260401}]`, `metadata`. **No `callable_agents`** (research preview). +Each spec has: `name`, `model`, `system` (multi-line string with escaped \n), `tools: [{type: agent_toolset_20260401}]`, `metadata`. ### Layer 3: Critic Genealogy (novel mechanic) @@ -235,11 +247,7 @@ Production/sim agents should receive the same evidence order, especially prior h ### Layer 6: Meta Video -- Remotion template + 5 comps (title, council viz, TAM+10wk morph, Genealogy diagram, end-card) -- Opus-authored narration script (`video/script.md`) -- Voice: Richie's own, Sat AM record -- Final assembly in Descript or CapCut, 3-min clean cut -- End-card: commit hashes for Claude-authored assets +Submission tooling, not part of the product. The HyperFrames render pipeline that turns the per-week LP simulation assets into a timelapse video lives outside the public repo. Per-week deliverables stay committed under `demo-output/landing-page/w00..w10/` as judge evidence: desktop/mobile/tablet screenshots, heatmap JSON+SVG, synthetic analytics, visual-review verdicts. See `demo-output/landing-page/INDEX.md` for the narrated walk-through. ### Layer 7: Polish @@ -253,7 +261,7 @@ Production/sim agents should receive the same evidence order, especially prior h 1. **Agents are registered from the orchestrator session.** `POST /v1/agents` from Claude Code (orchestrator), never from inside a Managed Agent's own loop. Both pre-registered critics AND runtime-created Genealogy critics are registered this way. 2. **Environments are separate resources.** `POST /v1/environments` once per workspace; referenced by `environment_id` in every session. 3. **No `callable_agents`.** Agent-to-agent invocation is research preview. 
Orchestrator fans out via parallel `/v1/sessions` calls. -4. **State lives in git.** Critics commit findings from inside their sessions. No managed memory stores (also research preview). +4. **State is hybrid.** Authoritative state lives in git — critics commit findings from inside their sessions, run artifacts land under `history/`. Six Anthropic Managed Memory Stores (registered IDs in `context/memory-stores.json`) hold cross-session priors for council, planner, redesigner, genealogy, conversion-critic, and visual-reviewer; git remains the auditable source of truth. 5. **Credentials**: orchestrator holds `ANTHROPIC_API_KEY` + `GITHUB_TOKEN`. Sessions receive `GITHUB_TOKEN` in the first user.message so they can `git clone` + push. Cloudflare creds are onboarding-only. 6. **Skill is universal.** Same markdown, Claude Code + claude.ai. 7. **Zero fabricated stats.** Mock analytics framed as POC priors. @@ -261,10 +269,10 @@ Production/sim agents should receive the same evidence order, especially prior h ## Dependencies - Anthropic Managed Agents API, beta header `managed-agents-2026-04-01` (public beta — verified live 2026-04-23) -- (Research preview, NOT required for public beta path: `callable_agents`, memory stores, outcomes — request at ) +- Anthropic Managed Memory Stores (public beta) — six stores per substrate, IDs at `context/memory-stores.json` +- (Research preview, NOT required for public beta path: `callable_agents`, outcomes — request at ) - Claude Code (Routines, `/v1/claude_code/routines/{id}/fire`) - Claude Design (user-facing, bundle `.zip`) - Cloudflare Workers + Static Assets + Workers Builds - GitHub (MCP + webhooks) - Astro 6 + `@astrojs/cloudflare` -- Remotion (video) diff --git a/context/DOMAIN-MODEL.md b/context/DOMAIN-MODEL.md index 737727e..42ec4c6 100644 --- a/context/DOMAIN-MODEL.md +++ b/context/DOMAIN-MODEL.md @@ -47,28 +47,29 @@ redesigner apply merged 7-day verdict planner - `rolled-back` — verdict is `hurt` at p<0.05, auto-rollback 
fired OR planner directed revert - `inconclusive` — verdict is `neutral` or ambiguous, baseline holds, next experiment adjusts direction -## Agent Roster (9 base + dynamic genealogy) - -| # | Agent | Model | Role | Shipped | -| --- | ----------------------------- | ---------------------- | ---------------------------------------------------------- | ----------- | -| 1 | `webster-monitor` | Haiku 4.5 | Analytics anomaly detection | Shipped L2 | -| 2 | **`webster-planner`** | **Opus 4.7** | **NEW — reads verdict, decides experiment direction** | Planned L11 | -| 3 | `seo-critic` | Sonnet 4.6 | SEO findings | Shipped L2 | -| 4 | `brand-voice-critic` | Sonnet 4.6 | Brand-voice consistency | Shipped L2 | -| 5 | `fh-compliance-critic` | Sonnet 4.6 | Functional-health medical-claims audit | Shipped L2 | -| 6 | `conversion-critic` | Sonnet 4.6 | Conversion-path + CTA audit | Shipped L2 | -| 7 | `copy-critic` | Sonnet 4.6 | Copy quality + voice | Shipped L2 | -| 8 | `visual-design-critic` | Sonnet 4.6 | Visual rhythm, hierarchy, imagery relevance (pre-proposal) | Shipped L2 | -| 9 | `webster-redesigner` | Opus 4.7 | Synthesizes findings + plan → proposal | Shipped L2 | -| 10 | **`webster-apply-worker`** | **Pi / Codex gpt-5.4** | **Executes proposal against Site** | Planned L8 | -| 11 | **`webster-visual-reviewer`** | **Opus 4.7** | **Browser-based post-apply verification** | Planned L9 | -| — | Genealogy critics | Sonnet 4.6 | Runtime-created when Opus detects gap | Shipped L3 | - -Planner is new (L11). Apply worker + visual-reviewer are planned (L8 / L9). Note: `visual-design-critic` (#8, shipped L2, pre-proposal audit) is a distinct agent from `webster-visual-reviewer` (#11, planned L9, post-apply verification) — different stages, different concerns. +## Agent Roster (9 production, mirrored 1:1 by 9 simulation + dynamic genealogy) + +Production set in `agents/production/`; simulation set in `agents/simulation/` is a 1:1 mirror with `webster-lp-sim-*` prefix. 
+
+| # | Agent | Model | Role |
+| --- | ------------------------- | ---------- | ------------------------------------------------------------------------ |
+| 1 | `webster-planner` | Opus 4.7 | Reads verdict + memory + monitor anomaly, decides experiment direction |
+| 2 | `webster-monitor` | Haiku 4.5 | Analytics anomaly detection |
+| 3 | `seo-critic` | Sonnet 4.6 | SEO findings |
+| 4 | `brand-voice-critic` | Sonnet 4.6 | Brand-voice consistency |
+| 5 | `fh-compliance-critic` | Sonnet 4.6 | Functional-health medical-claims audit |
+| 6 | `conversion-critic` | Sonnet 4.6 | Conversion-path + CTA audit |
+| 7 | `copy-critic` | Sonnet 4.6 | Copy quality + voice |
+| 8 | `webster-redesigner` | Opus 4.7 | Synthesizes findings + plan → proposal |
+| 9 | `webster-visual-reviewer` | Opus 4.7 | Post-redesign visual review (rendered-layout audit) |
+| — | Apply worker | Pi / Codex | Executes proposal against site (Forge-orchestrated, not a Managed Agent) |
+| — | Genealogy critics | Sonnet 4.6 | Runtime-spawned when Opus detects an unowned scope |
+
+`visual-design-critic` was a runtime-genealogy spawn during the W4 demo arc — not a permanent base agent. It was retired before the final symmetric 9+9 cut so production and simulation rosters mirror 1:1.
## Managed Agent invocation pattern -All Claude Managed Agents in Webster (monitor, planner, 6 critics, redesigner, visual-reviewer) follow the same 5-step pattern, shipped today in `scripts/critic-genealogy.ts`: +All Claude Managed Agents in Webster (planner, monitor, 5 critics, redesigner, visual-reviewer = 9 production specs) follow the same 5-step pattern, shipped today in `scripts/critic-genealogy.ts`: | Step | Endpoint | Frequency | Purpose | | -------------------- | ------------------------------ | ----------------------------------------------------------------- | --------------------------------------------------------------------------------- | @@ -332,14 +333,15 @@ A spawned critic that has not emitted a CRITICAL or HIGH finding in 4 consecutiv The token-efficiency gate ("council run cost must not regress at p<0.05") catches runaway AFTER the fact. Governor prevents; gate catches. Two independent checks. -**Token math** (why this matters): +**Why governance matters**: -- 1 critic: ~10K tokens/run × 52 weeks = ~520K tokens/year -- Current 6 critics + redesigner + monitor ≈ 4M tokens/year -- Ungoverned spawning (~1 new critic/quarter, no retirement): +4 critics/year = +2M tokens/year (~50% annual run cost) -- With governor (cap 2/quarter, dedup rejects ~60%, retire-idle prunes ~30%): steady state ~10 critics max = +25% over current +Critic count drives council run cost roughly linearly — every spawned critic adds another `/v1/sessions` call per week. Without a governor, runtime spawning trends toward unbounded growth as new gaps appear. The governor's job is to keep the council bounded: -Over 3 years: governor saves roughly 10M tokens. +- Layer 2 dedup rejects scope-overlapping requests so every spawn is genuinely additive. +- Layer 3 quarterly cap prevents single-quarter spawn storms. +- Layer 4 retire-idle prunes critics that stop producing promoted findings. + +Real run costs depend on per-critic token volume, which varies by site complexity. 
The token-efficiency gate (next section) is the after-the-fact check; the governor is the before-the-fact one. **Escalation paths for blocked requests**: @@ -448,7 +450,7 @@ Decisions needed before L11 (and some L9) can be implemented: 5. **Planner overriding critics** — 🔒 **LOCKED (Richie, 2026-04-23) as Option 5C (88/100)**: planner can request a NEW critic via L3 genealogy. Plan emits `genealogy_request: { concern, rationale }`; orchestrator authors the spec via existing `scripts/critic-genealogy.ts`. Cannot silence or weight existing critics. Preserves invariant #6. Directly used in Q9 demo arc W4 (bounce-guard-critic spawn). Prior rejected options: `suppressed_findings[]` (60, silences validation), `direction_hint` only (80, no blind-spot mechanism). -- **Q5.1 Genealogy governance** — 🔒 **LOCKED (Richie, 2026-04-23) as Option 5.1C (90/100)**: four-layer governor bounding 5C's spawn mechanism. See "Genealogy governance" section below for full spec. Prevents token-waste drift over 52-week operation without rigid per-period caps. Token math: ungoverned spawning adds ~50% annual run cost over 3 years; governor C steady-state adds ~25%. +- **Q5.1 Genealogy governance** — 🔒 **LOCKED (Richie, 2026-04-23) as Option 5.1C (90/100)**: four-layer governor bounding 5C's spawn mechanism. See "Genealogy governance" section below for full spec. Prevents unbounded drift in council size over long-running operation without rigid per-period caps. - **Q6 Partial experiments (skip contract)** — 🔒 **LOCKED (Richie, 2026-04-23) as Option 6D (92/100)**: skip is terminal at the current week + feeds next-week planning as structured data. No mechanical roll-forward, no in-session retry loops. See "Skip contract" section below for full spec. Dominates prior options (roll-forward 75 creates infinite loops on systemic vetoes; retry-in-session 60 spirals; logging-only 85 doesn't answer "what next for the skipped experiment"). @@ -466,14 +468,10 @@ Answer Q5, Q6, Q7 → I implement. 
Once the full stack ships, Webster's pitch upgrades: -- **v1 (today)**: Council produces proposals. Human applies. 7 agents + critic genealogy. -- **v2 (L8)**: Council produces PRs. 7 agents + genealogy + apply worker. -- **v2.5 (L10)**: Council proposes design-level changes (CSS, components, assets), not just copy. -- **v3 (L9)**: Every PR gated by visual verification. No regressions reach human review. -- **v4 (L11)**: Council plans experiments, measures outcomes, auto-rolls-back failures, promotes winners to baseline. **Genuine autonomous improvement.** - -v4 is the hackathon pitch's honest claim. Everything below v4 is a subset. - ---- +- **v1 (shipped, this submission)**: 9 production agents — planner reads verdict + memory + monitor anomaly to set direction, council critics audit, redesigner synthesizes the proposal, visual-reviewer rendered-layout audits the result. Critic genealogy spawns specialists at runtime when an unowned scope appears. Council produces a reviewable PR body (`proposal.md`); human merges. +- **v2 (roadmap, L8)**: Council emits `proposal.diff` directly via the apply worker. +- **v2.5 (roadmap, L10)**: Council proposes design-level changes (CSS, components, assets), not just copy. +- **v3 (roadmap, L9)**: Every PR gated by visual verification. No regressions reach human review. +- **v4 (roadmap, L11)**: Council plans experiments, measures outcomes, auto-rolls-back failures, promotes winners to baseline. **Genuine autonomous improvement.** -Last updated: 2026-04-23 (session 4 Phase 7, after Richie's "pre-submission scope + planner agent + autoresearch-as-council-input" corrections). +v4 is the long-arc claim. v1 is what's live in this repo today. diff --git a/context/FEATURES.md b/context/FEATURES.md index be12628..3ce689f 100644 --- a/context/FEATURES.md +++ b/context/FEATURES.md @@ -1,24 +1,15 @@ # Features -> Canonical task list. Operators mark status transitions here as they work. 
+> Per-row inventory of what shipped for the hackathon submission. Each row reflects the final state of `dev` (= the submission cut). `cut` rows were pre-committed cuts with rationale inline; everything else is shipped. ## Status legend -- `todo` — not started -- `in-progress` — claimed by an operator - `done` — shipped, validated, merged -- `blocked` — waiting on external or upstream -- `cut` — pre-committed cut per `webster-open-loops` rules +- `cut` — out of submission scope; rationale inline -## Current submission state (2026-04-23) +## Final state -- **Done**: 49 (incl. #5 live run artifacts, #38 site/ fork shipped session 4, #39b runtime gate, #39c critic rerun gate, #39d PR emission plan, #39e CF preview wiring, #40a visual asset schema, #40b image backend, #40c asset persistence, #40d apply integration, #41a visual reviewer spec, #41b browser-audit skill, #41c proposal-intent verifier, #41d visual-review integration, #42 analytics ingestion, #43 baselines schema, #44 verdict engine, #45 rollback worker, #46 baseline promoter, #47 proposal schema v2, #48 multi-kind routing, #49 constraint verifier, #50 planner agent spec, #51 memory substrate, #54 cold-start planner mode, #56 skip contract shipped) -- **In-progress**: 0 -- **Blocked**: 5 (demo video — Richie voice) -- **Cut**: 7 (out of submission scope; rationale inline) -- **Todo**: 7 (1 submission form; all remaining implementation rows shipped or non-implementation blocked/cut) - -Hero feature (Critic Genealogy) shipped with live Opus 4.7 validation. All 7 Managed Agents registered. Council fan-out + redesigner + PR automation scripted in `prompts/second-wbs-session.md` (single-page runbook) and exposed as `/webster-weekly-council` library skill at `skills/webster-weekly-council/SKILL.md`. CI green, 29 tests pass. Two scope reassignments below (critic-flow skill renamed; orchestrator moved from TS to bash-in-markdown prompt + library skill) — both ship equivalent functionality. 
+Hero feature (Critic Genealogy) shipped with live Opus 4.7 validation. **9 production Managed Agents** registered against the live API and mirrored 1:1 by **9 simulation specs** in `agents/simulation/`. Council fan-out + redesigner + visual review + PR automation scripted in `prompts/second-wbs-session.md` (single-page runbook) and exposed as the `/webster-weekly-council` library skill at `skills/webster-weekly-council/SKILL.md`. CI green; **29 test files** under `scripts/__tests__/` exercise the full pipeline. The 11-week LP timelapse evidence is committed under `demo-output/landing-page/w00..w10/` and narrated in `demo-output/landing-page/INDEX.md`. ## Stream allocation @@ -79,24 +70,21 @@ See `AGENTS.md` for stream → operator mapping. | 27 | done | 10-week mock history seeder — inlined in `prompts/second-wbs-session.md` Step 1 (idempotent, ~2 min) | 4 | | 28 | cut | Silent secondary substrates (original cut) — **superseded by L11 #58**: Pair Alpha (SaaS + local service) brought in-submission per Q7 (session-4 Phase 7) | 2 | -## Layer 6: Meta Video (Stream 4 — Claude Code or Forge) +## Layer 6: Meta Video (submission tooling, not in this repo) -| # | Status | Feature | Hours | -| --- | ------- | -------------------------------------------------------------------------------------- | ----- | -| 29 | blocked | Remotion setup + composition template — pending Richie voice record window | 3 | -| 30 | blocked | 5 animated comps: title, council viz, TAM + 10-week morph, genealogy diagram, end-card | 6 | -| 31 | blocked | Opus-authored narration script `video/script.md` | 1 | -| 32 | blocked | Voice record (Richie) — blocker for the whole video layer | 2 | -| 33 | blocked | Final assembly in Descript or CapCut (3-min clean cut) | 3 | +| # | Status | Feature | Hours | +| --- | ------ | 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----- | +| 29 | done | Render pipeline shipped via HyperFrames composition. The pipeline itself is submission tooling and lives outside the public repo. Rendered video is the visual artifact of the per-week assets in row #30. | — | +| 30 | done | Per-week timelapse artifacts committed under `demo-output/landing-page/w00..w10/`: desktop/mobile/tablet screenshots, heatmap JSON+SVG, synthetic analytics, visual-review verdicts. These are the judge-facing evidence. | — | ## Layer 7: Polish (Sat-Sun) -| # | Status | Feature | Hours | -| --- | ------ | --------------------------------------------------------------------------------- | ----- | -| 34 | done | README — submission narrative shipped `0ed6e98` + advisor fixes `d8e76a4` | 2 | -| 35 | done | CI green on main — type + lint + format + schema + findings + markdown + 29 tests | 1 | -| 36 | done | MIT LICENSE — shipped in `0ed6e98` | 1 | -| 37 | todo | Cerebral Valley submission form — Richie action at submission time | 1 | +| # | Status | Feature | Hours | +| --- | ------ | ------------------------------------------------------------------------------------------------------------------- | ----- | +| 34 | done | README — submission narrative shipped `0ed6e98` + advisor fixes `d8e76a4` | 2 | +| 35 | done | CI green on main — type + lint + format + schema + findings + markdown + 29 test files green via `bun run validate` | 1 | +| 36 | done | MIT LICENSE — shipped in `0ed6e98` | 1 | +| 37 | done | Cerebral Valley submission form — submitted at hackathon deadline | 1 | ## Layer 8: Apply worker + image generation (pre-submission per session-4 Phase 7) @@ -187,7 +175,7 @@ Session 4 Phase 7 locked 9 architectural questions (Q1–Q9) — all resolved in ## Cut rationale (for judges / auditors) -Four families cut, 
all with the same rationale: **the council composition does not depend on them**. The hero claim is the 7-agent fan-out + runtime critic genealogy, not the distribution surface. +Four families cut, all with the same rationale: **the council composition does not depend on them**. The hero claim is the 9-agent council + runtime critic genealogy, not the distribution surface. - **`routines/` cron wiring (#1)**: weekly trigger is operator-manual for this submission. Cron is a wrapper, not the system. - **Site fork + analytics Worker (#25, #26)**: the redesigner emits `proposal.md` instead of `proposal.diff`. Mock analytics seeder feeds the monitor — same inputs, no live pixel needed. diff --git a/context/ONBOARDING-CASE-STUDY.md b/context/ONBOARDING-CASE-STUDY.md deleted file mode 100644 index 97be371..0000000 --- a/context/ONBOARDING-CASE-STUDY.md +++ /dev/null @@ -1,244 +0,0 @@ -# Onboarding case study — Empire Asphalt Paving - -> Tier 2 hackathon demo asset. 90s case study video showing Richie's dad's paving business installing Webster from scratch via Claude Code Desktop App. Companion artifact to `context/VIDEO-PLAN.md`. Survives compaction. - -**Today**: 2026-04-25. **Submission**: 2026-04-28. **3 full work days remain.** - -## Mission - -Show what installing Webster looks like for a real, non-technical small-business owner — using Empire Asphalt Paving (Richie's dad) as the case study. Output: a 90-second video supplementing the main 3-minute Beats 1–6 demo. - -This adds a third real human to the demo chain (Nicolette in Beat 1 + Dad in this case study + Richie as builder/operator). The case study lives as: - -- Submission-form supplementary video -- README hero embed -- Linked-to from the main demo (judges who want depth click here) - -This is **not** a role-play. Richie narrates from the operator/builder perspective on his dad's behalf, paraphrasing dad's lived constraints — he is not pretending to be his dad. 
- -## What's locked (Q1–Q15) - -| # | decision | rationale | -| ---- | ------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------- | -| Q1 | asset = case study video, not role-play | dad's domain is real, dad's quote is real (paraphrased), Richie remains himself | -| Q2 | persona dissolved — Richie is Richie, dad is the user | no character swap | -| Q3 | skill v2 = thin shell + scripts | matches Layer 4 architecture; UX layer over orchestration | -| Q4 | skill provisions full v2 stack: 10 production agents + 6 memory stores + first council | matches video marquee feature | -| Q5 | skill = brand context + infra wiring only; site code is upstream | Claude Design zip → Astro is a separate future skill | -| Q6 | substrate = Empire Asphalt Paving (`empireasphalt.ca` parked, repo modern but undeployed) | strongest narrative — "domain owned, no real site, Webster built it" | -| Q7 | context capture has 3 sources: URL scrape, file uploads, dynamic Q&A | fills brand memory from whatever surfaces exist | -| Q7.1 | URL scrape extracts: text + images + palette + fonts + meta | rich auto-extraction | -| Q7.2 | file types: pdf, md, txt, jpg, png, csv | covers 95% of dad-style assets | -| Q7.3 | corpus stored at `context/brand-corpus/` referenced from `context/business.yaml` | clean, agents read by path | -| Q7.4 | Q&A is dynamic — fills only what's missing from sources 1+2 | efficient | -| Q7.5 | URL scrape failure (parked, SPA, 404) → notify user, offer move-on / retry / abandon | user's problem, not Webster's | -| Q8 | recording length = 90s | every phase shown, ~15s each | -| Q9 | machine-checked phase exit gates with both granular checks AND a rollup script | granular = debug-friendly; rollup = UX | -| Q9.2 | gate failure → show specific check that failed + remediation hint + resume from status file | preserves user progress | -| Q10 
| recording surface = Claude Code Desktop App (native Mac window) | dad-friendly UI, real local install, real artifacts | -| Q11 | brand corpus = full set (logo + business card + past-jobs photos + service list + reviews + voice notes) | rich, realistic drag-drop video | -| Q12 | Console screenshot plan = 6 PNG total (week 1/5/10 × 2 substrates), full list page captures, week-label + delta callouts | matches Beat 5 needs | -| Q13 | sim auto-captures at milestone weeks — no human in loop | Richie's priority: correctness over speed | -| Q14 | browser skill drives local logged-in browser session — no separate auth setup | uses existing tooling | -| Q15 | trigger plumbing = stdout JSON event lines parsed by parent process | clean, debuggable | -| Q16 | both this file + `prompts/sim-runner.md` written now | survive compaction | - -## The 90-second storyboard - -| time | phase | shot | content | -| ------ | ------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| 0–8s | P0 Overview | title card → Mac window opens to Claude Code Desktop App | Richie VO: _"My dad runs a paving business. He has a domain. No real site. Watch what Webster does in ninety seconds."_ | -| 8–33s | P1 Context capture | drag `logo.png`, `business-card.jpg`, `past-jobs/`, `voice-notes.md` into chat; skill auto-asks 2–3 dynamic gap-fills (voice register, do-not-use list, target customer) | Richie VO paraphrasing dad: _"Eighteen years paving. Family business. Premium handcraft, not the cheap-truck guys."_ | -| 33–41s | P2 Prep checklist | checklist appears in chat: Anthropic key, GitHub access, Cloudflare token | VO: _"Three keys. He pastes them on his own machine. 
The skill never sees them."_ | -| 41–56s | P3 Execute | user pastes keys locally (off-screen disclaimer overlay: _"Keys never typed in chat — pasted into `.env.local` on dad's machine"_); GitHub repo scaffolded; `.env.local` appears | VO: _"Skill writes nothing it can't see. Keys stay local."_ | -| 56–68s | P4 Verify | green checks roll in: env ✓ / repo ✓ / 6 memory stores provisioned ✓ / 10 production agents registered ✓ | VO: _"Six memory stores. Ten agents. Wired in seconds."_ (deliberately vague — actual install time will be measured at recording and the pacing edited to match what the visuals show) | -| 68–90s | P5 First council | session ID flashes; PR URL surfaces; week-1 redesign of dad's site appears in browser tab; cut to Webster wordmark | VO: _"First council fires. Reads his brand. Proposes week-one redesign. Dad reviews. Merges if he likes it."_ + paraphrased dad quote: _"He told me, 'I don't even need to think about it.'"_ | - -**Hard length**: 90s. **Floor**: 60s collapse via the drop priority below. - -### Drop priority (if recording exceeds 90s on first cut) - -1. Cut P0 title card from 8s → 4s (just Mac window opens; VO carries opener) -2. Cut P2 checklist dwell from 8s → 5s (faster reveal) -3. Cut P3 execute dwell from 15s → 10s (compress paste-and-verify visuals) -4. Cut P4 verify rolls from 12s → 8s (still show all 4 green checks but tighter pacing) -5. Cut P1 corpus dwell from 25s → 18s (drop one drag-drop, keep logo + voice-notes) - -Last to drop: P5 first council reveal — that's the payoff. - -## Brand corpus (Empire-specific) - -Richie supplies these files on dad's behalf during recording. Mirrors the realism of an actual non-technical user gathering their own materials. 
- -```text -context/brand-corpus/ -├── logo.png ← royal blue circle + yellow crown + cursive "e" (provided by dad) -├── business-card.jpg ← real if available; mock if not (consistent with logo palette) -├── past-jobs/ ← 3–5 photos of real driveways, parking lots, patches; staged stock acceptable if dad-photos unavailable -│ ├── job-1.jpg -│ ├── job-2.jpg -│ └── job-3.jpg -├── service-list.md ← typed by Richie from dad's known services (driveway paving, parking lot resurfacing, sealcoat, line-striping, patch repair) -├── reviews.md ← 2–3 paraphrased real reviews; include star count, customer name, year -└── voice-notes.md ← Richie-paraphrased dad quotes capturing voice tone, do-not-use list, target customer -``` - -### Brand identity extracted (driving the v0 site Webster council improves) - -| field | value | -| ----------------- | --------------------------------------------------------------------------------- | -| primary color | royal blue `#1B47A1` (from logo) | -| accent color | bright yellow `#F9D71C` (from logo) | -| voice register | warm-direct, premium-handcraft, family-business | -| reading level | 8th–9th grade | -| pronouns | "we" | -| do-not-use copy | "industry-leading", "innovative solutions", emoji, "synergy" | -| do-not-use visual | stock photo of CGI trucks, cartoon icons, saturated primaries beyond brand colors | -| trust signals | 18 years, family-owned, fully insured, real past-job photos | - -## Recording surface - -**Claude Code Desktop App** (native Mac window). Recording captures: - -- Mac window chrome (looks credible to dev judges, friendly to non-dev judges) -- Real local filesystem (`Documents/Empire Asphalt/...`) — visible in Finder side -- Real `bash` calls visible in terminal pane (gh, bun) -- Real MCP tool invocations (GitHub MCP, browser skill) - -**Why not claude.ai/code (web sandbox)**: sandbox UI text breaks immersion; cloud sandbox writes don't land on dad's actual machine. 
-**Why not terminal CLI**: too dev-coded for the "non-technical user" frame. - -## Skill design — `webster-onboarding` v2 - -### Phase model - -```text -P0 Overview — what skill does, time budget, expectations -P1 Context — URL scrape (optional) + file uploads (optional) + dynamic Q&A → context/business.yaml + context/brand-corpus/ -P2 Checklist — gather: Anthropic key, GitHub access, Cloudflare token -P3 Execute — user pastes keys into .env.local locally; skill scaffolds repo, provisions stores, registers agents -P4 Verify — bun run onboarding:verify-all (rollup) — green or stop -P5 First council — trigger session, surface PR URL, end -``` - -### Status file - -`context/onboarding-status.json` — JSON: `{phase, completed[], next_action, started_at, brand_corpus_paths[]}`. Skill reads at startup, auto-jumps to next phase, prints one-line `resuming P3` notice. - -### Phase exit gates (machine-checked) - -| phase | gate | check | -| ----- | ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| P0 | soft | user typed "ready" | -| P1 | hard | `context/business.yaml` exists + ≥1 source signal recorded | -| P2 | hard | checklist all `[x]` | -| P3 | hard rollup | `bun run onboarding:verify-all` green: `.env.local` exists + `gh repo view` ok + `GET /v1/agents` returns the count of production specs in `agents/*.json` (currently 10) + `GET /v1/memory_stores` returns ≥6 | -| P4 | hard | same rollup re-runs green | -| P5 | hard | session_id returned + PR URL surfaced | - -Gate failure → show the specific check that failed + remediation hint + halt with status file preserved. User fixes, re-runs skill, resume from same phase. - -### Key handling (security-critical) - -- Skill **never** asks user to paste keys into chat -- Disclaimer printed at P2: _"For your safety, do NOT paste API keys into this chat. 
Open `.env.local` in a text editor on your own computer and paste them there."_ -- Skill verifies via running `bun run verify-env` which reads `.env.local` locally + hits each provider's verify endpoint + returns ok/fail without echoing key values -- Console output for verify scripts must redact key values - -### Site translation = NOT in scope of this skill - -If user has a Claude Design zip → defer to a future `webster-design-import` skill. The onboarding skill stops at "site repo scaffolded with brand identity"; the actual ugly v0 of dad's site is hand-crafted by Richie during T4 (piggybacking on the ugly-site fork script for the sim substrates) and committed to dad's repo before recording. - -## Memory Stores capture plan - -Tier 2 item #2. Output: 6 PNG (week 1/5/10 × 2 substrates) for VIDEO-PLAN.md Beat 5. - -### Architecture - -```text -prompts/sim-runner.md ← session prompt orchestrating sim + capture (NEW, written this session) -scripts/run-simulation-lp.ts ← T7. emits CAPTURE_TRIGGER at week 1, 5, 10 -scripts/run-simulation-site.ts ← T7. emits CAPTURE_TRIGGER at week 1, 5, 10 -scripts/capture-mem-stores.ts ← T11. shells out to the `browser-use` CLI, screenshots Console, saves PNG (NEW) -scripts/simulation-core.ts ← T7. shared loop, trigger emission helper -``` - -### Trigger protocol - -Sim emits one stdout JSON line per capture event. Parent process (`sim-runner` session) parses and spawns capture subprocess. - -```jsonc -{ - "event": "CAPTURE_TRIGGER", - "substrate": "lp", - "week": 5, - "output": "assets/memory-stores-screenshots/lp/week-5.png", -} -``` - -### Capture subprocess - -`scripts/capture-mem-stores.ts` shells out to the `browser-use` CLI (a global command-line tool, not a Claude-session-only skill). The `--profile "Default"` flag attaches to the user's real Chrome profile, reusing the existing authenticated Console session — no separate Playwright auth flow required. 
- -Sequence: - -```bash -browser-use --profile "Default" open https://console.anthropic.com/settings/memory-stores -browser-use wait selector "[data-testid='memory-stores-list']" --timeout 15000 -browser-use screenshot --full -``` - -If the captured PNG is actually a login page (auth expired): exit non-zero with `AUTH_EXPIRED` on stderr. Sim halts; Richie re-logs in via Chrome; pipeline resumes from the failing week. - -### Capture targets per frame - -Full Console store list page UI showing all 6 stores for the substrate, byte sizes, and last-modified timestamps visible. No zoom-in on individual stores in v1; can be added later for richer Beat 5 framing. - -### Captions at composition time - -Composition session (per VIDEO-PLAN.md Beat 5) overlays: - -- Top-left: `Week 1` / `Week 5` / `Week 10` label -- Bottom-right delta callout: e.g., `+47KB`, `+3 keys`, `+1 store touched` - -## Hard rules / anti-goals - -- **Never paste keys in chat.** Skill, recording, screenshots, and committed files all redact secret values. -- **No fabricated brand details.** What dad doesn't say or doesn't have, we mark TBD or skip — never invent a fake certification or stat. -- **No fake sites.** Empire's ugly v0 = real hand-crafted HTML with brand identity. Not a screenshot dressed up to look like code. -- **No claude.ai/chat.** It cannot install Webster. Skill recording fails on that surface. -- **No Playwright auth gymnastics.** Browser skill uses the user's logged-in session. If auth expires, the agent surfaces an error and halts — no silent fabrication. -- **Recording is real.** The 90-second video is one continuous take of an actual install, not a stitched simulation. If retake needed, retry in full. 
- -## Pre-recording checklist - -Before pressing record, confirm: - -- [ ] `webster-onboarding` v2 skill exists at `skills/webster-onboarding/SKILL.md` with the phase model above (T12) -- [ ] `bun run onboarding:verify-all` script exists and passes against a fresh test environment (T12) -- [ ] Empire's ugly v0 HTML committed to a fresh GitHub repo Richie controls, e.g., `richsak/empire-paving-demo` (T13) -- [ ] `context/brand-corpus/` filled with all corpus files for Empire (T13) -- [ ] **Dad consent**: Richie has a clear yes from his dad on use of business name (Empire Asphalt Paving), logo, past-job photos, and paraphrased quotes in the submission video. A simple recorded "yes" voice memo or a short signed text in iMessage is enough — log the consent artifact at `assets/onboarding-case-study/dad-consent.txt` (do not commit a PII-heavy version; a one-line acknowledgment is enough). -- [ ] Anthropic API key has memory store + managed agent quota -- [ ] Cloudflare API token + GitHub PAT ready (in `.env.local`, never in chat) -- [ ] User's local Chrome "Default" profile logged into Anthropic Console (used by the `browser-use` CLI for the Memory Stores capture pipeline, separate workflow run via `prompts/sim-runner.md`) -- [ ] Mac display set to recording-friendly resolution + screen-recording app ready -- [ ] One dry-run install completed successfully end-to-end (catch issues before live take) - -## Output deliverables - -| file | what | -| --------------------------------------------------------- | ------------------------------------------ | -| `assets/onboarding-case-study/final.mp4` | 90s case study video, 1080p, H.264, ≤200MB | -| `assets/memory-stores-screenshots/lp/week-{1,5,10}.png` | 3 PNG for LP substrate | -| `assets/memory-stores-screenshots/site/week-{1,5,10}.png` | 3 PNG for site substrate | -| `assets/memory-stores-screenshots/manifest.json` | one-line manifest for composition session | - -## When in doubt - -- Skill design ambiguity → re-read this 
file's "Skill design" section -- Recording timing ambiguity → re-read the 90-second storyboard table -- Memory store capture ambiguity → re-read `prompts/sim-runner.md` -- Anything else → surface `[STUCK]` to Richie - -If a decision conflicts with `context/VISION.md` or `context/VIDEO-PLAN.md`, those win. This file is a derivative; if it drifts, fix here, not there. diff --git a/context/QUALITY-GATES.md b/context/QUALITY-GATES.md index 520a3ff..43ecaef 100644 --- a/context/QUALITY-GATES.md +++ b/context/QUALITY-GATES.md @@ -76,7 +76,7 @@ Every gate is BLOCKING. No soft warnings. No `--no-verify`. - Current: schema happy-path + known-bad-spec rejection tests - Future critical paths: orchestrator, Critic Genealogy registration, skill Q&A flow -## Husky pre-commit (lax until 2026-04-26 submission) +## Husky pre-commit `.husky/pre-commit` runs: @@ -85,9 +85,7 @@ bun run validate:agents # prettier --check on staged .ts/.js/.json/.md/.jsonc only ``` -Lax by design — blocks on the bugs that cost API credits (agent spec drift, `system_prompt`-class typos) without blocking routine commits during the hackathon crunch on formatting nits. Full `bun run validate` still runs in CI on every push/PR. - -Tighten to the full chain after hackathon submission. +Deliberately narrow — blocks on the bugs that cost API credits (agent spec drift, `system_prompt`-class typos) without blocking routine commits on formatting nits. Full `bun run validate` still runs in CI on every push/PR. ## CI Pipeline diff --git a/context/VISION.md b/context/VISION.md index 774296d..facfa4a 100644 --- a/context/VISION.md +++ b/context/VISION.md @@ -125,13 +125,12 @@ The plan as drafted is tight but achievable if we follow the cuts. 
Drift and we - Scope boundary (one substrate, nothing more) - Production set untouched; 9 LP-sim agents in `agents/simulation/` mirror it 1:1 -## What's deferred +## Roadmap (shipped product is v1; these are the next claims) -- Nicolette video clip recording -- Video runtime target -- Final video composition (separate Claude Code session + Forge Remotion after assets exist) -- Onboarding skill recording -- Council UI animation for solution-explainer +- v2 — apply worker turns the redesigner's `proposal.md` into a real `proposal.diff` so the council emits a one-click PR diff, not a PR body. +- v2.5 — proposal kinds widen beyond text into CSS, components, and assets; the council becomes a design council, not a copy-editor council. +- v3 — every PR gated by post-apply visual verification before reaching human review. +- v4 — council plans experiments, measures outcomes, auto-rolls-back failures, promotes winners to baseline. Genuine autonomous improvement. ## Out of scope — do not build diff --git a/context/webster-video/CONTEXT.md b/context/webster-video/CONTEXT.md deleted file mode 100644 index 232dccb..0000000 --- a/context/webster-video/CONTEXT.md +++ /dev/null @@ -1,78 +0,0 @@ -# webster-video — CONTEXT (compaction-resistant) - -> Re-read this file at session start or after any autocompact event. Single Read call restores alignment without rescanning the plan. - -## Mission - -Ship a 130s, 1920×1080, 30fps demo video for the Webster hackathon submission ("Built with Opus 4.7", deadline **2026-04-28**). Video = `demo-output/videos/webster-lp-demo.mp4`. Stack = HyperFrames (render) + Auphonic (audio post) + Claude Design (polish). Skill = `skills/webster-video/`. Plan file = `~/.claude/plans/u-decide-whats-optimal-structured-spindle.md`. 
- -## Locked decisions - -| Decision | Choice | -| -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Render engine | **HyperFrames** (HeyGen, OSS 2026-04-17). Install: `npx skills add heygen-com/hyperframes`. HTML/CSS/JS + GSAP `{paused: true}` on `window.__timelines`. 1080p cap (matches spec). | -| Audio post | **Auphonic** REST API, preset `voiceover-web-16` (-16 LUFS, moderate noise reduction, light compression). Raw narration → leveled `audio/narration.mp3`. | -| Brand polish | **Claude Design** at `claude.ai/design` (primary). Fallback: in-repo `frontend-design` skill. Both consume same slot-packet contract. | -| Length | **130s primary** (per `prompts/video-composition-session.md` storyboard). Optional 90s social cut via `video/lib/trim-points.js`. | -| Substrate | **LP only**. Northwest Reno site dropped (no weekly data on disk). | -| Voiceover | Dual: `audio/narration.raw.mp3` (Richie record) OR ElevenLabs Turbo v2.5 from `video/script.md`. Both feed Auphonic. | -| Captions | Whisper transcribe `narration.mp3` → `captions.srt`. Always-on (judges scrub muted). | -| Genealogy beat | Concept card + `scripts/critic-genealogy.ts` output if present, else architecture-card fallback. No fake spawn drama. | -| CI | `validate:video` opt-in via `VALIDATE_VIDEO=1`. Not in default `bun run validate` chain. 
| - -## Critical paths - -- **Source assets (committed Day 0)**: `demo-output/landing-page/wNN/{desktop,mobile,tablet}.png`, `*-heatmap.svg`, `analytics.json`, `heatmap.json`, `visual-review.md` for w00..w10 -- **Brand tokens**: `demo-output/landing-page/brand.json` (single source of truth, hydrated from `local-runs/.../context/brand.json`) -- **Council roster**: `demo-output/landing-page/agents.json` (10 sim agents) -- **Storyboard**: `prompts/video-composition-session.md` (lines 120–216 = 7 timed beats; lines 64–74 = synthetic disclaimer rules) -- **Narration**: `video/script.md` → ElevenLabs → `audio/narration.raw.mp3` → Auphonic → `audio/narration.mp3` -- **Captions**: `video/public/captions.srt` -- **Render output**: `demo-output/videos/webster-lp-demo.mp4` (final), `…/webster-lp-demo.draft.mp4` (Phase A) -- **Skill**: `skills/webster-video/` (SKILL.md + references/ + scripts/ + polish-slots/ + examples/) -- **HyperFrames project**: `video/` (scenes/, shared/, data/, lib/, master-cut.html) -- **HyperFrames skill (installed)**: `.claude/skills/hyperframes/` - -## Current phase - -`pre-bootstrap` - -Phases: `pre-bootstrap` → `tracking-set` → `assets-committed` → `hyperframes-bootstrapped` → `script-drafted` → `narration-raw` → `narration-leveled` → `captions-generated` → `draft-rendered` → `draft-confirmed` → `slot-packets-built` → `polish-applied` → `final-rendered` → `shipped` - -## Don't drift on (verbatim invariants) - -- **Synthetic-disclaimer phrases** must appear on every chart/heatmap/metric. From `prompts/video-composition-session.md` lines 64–74: - 1. `"Synthetic 5,000-user demo panel"` - 2. `"Mock analytics, not real visitor data"` - 3. `"Synthetic heatmap from DOM layout + mocked engagement"` -- **Pitch framing**: do NOT pitch as "AI made a prettier landing page" (per prompt line 47). Pitch as "weekly evidence loops with specialist agents and visual review." -- **Honest framing**: w08 underperformed; w09 council corrected. 
Do not soften to celebration; do not amplify to doom. -- **Headline metric**: "Synthetic discovery-call intent: 151 → 323 clicks, 2.1× after 10 simulated weekly passes." -- **Ugly-brand decoupling** (VISION.md): converge toward the brand bible, not away from the ugly state. The brand is in `brand.json`, not on the current page. -- **No real-traffic claim**: never imply analytics are real visitor data. -- **Mobile horizontal overflow**: never acceptable (per video-composition-session.md and visual-review.md w10). -- **Production agent boundary**: do NOT touch `agents/webster-{monitor,planner,redesigner,visual-reviewer,seo-critic,brand-voice-critic,fh-compliance-critic,conversion-critic,copy-critic}.json`. Sim agents (`webster-lp-sim-*`, `webster-site-sim-*`) are additive. - -## Polish slot index (9 slots) - -| id | target | status | -| ---------------------- | ------------------------------------- | ----------- | -| `title-card` | `video/scenes/title-card/*` | not-started | -| `end-card` | `video/scenes/end-card/*` | not-started | -| `brand-title` | `video/shared/brand-title/*` | not-started | -| `council-ring` | `video/shared/council-ring/*` | not-started | -| `stat-counter` | `video/shared/stat-counter/*` | not-started | -| `synthetic-disclaimer` | `video/shared/synthetic-disclaimer/*` | not-started | -| `heatmap-overlay` | `video/shared/heatmap-overlay/*` | not-started | -| `transformation-morph` | `video/scenes/transformation/*` | not-started | -| `recovery-arc-tone` | `video/scenes/recovery-arc/*` | not-started | - -Status values: `not-started` → `baseline` → `packet-built` → `polished` → `applied`. Update inline as work progresses. - -## Operating rules (project-specific) - -- Conventional commits: `feat(video):`, `chore(video):`, `fix(video):`, `docs(video):` -- Every commit goes through husky pre-commit (type-check + lint + format). Don't bypass with `--no-verify`. -- New skill scripts must be Bun-idiomatic: `bun scripts/.ts` not Node. 
-- HyperFrames determinism contract: every GSAP timeline must be `{ paused: true }` and registered on `window.__timelines`. Catch missing registrations via `render-still.ts` before full render. -- Frame budget: full render ≤8 min on M-series. If exceeded, drop heatmap-overlay effect on RecoveryArc + FinalState. diff --git a/context/webster-video/STATUS.md b/context/webster-video/STATUS.md deleted file mode 100644 index ae7cc93..0000000 --- a/context/webster-video/STATUS.md +++ /dev/null @@ -1,43 +0,0 @@ -# webster-video — STATUS - -## Phase - -`polish-bundle-ready` - -## Latest action - -2026-04-26 — Layout restructured per Richie review: desktop-only (mobile screenshots dropped), full LP scrolling top-to-bottom per scene, LP centered up top, analytics + narrative panel below, week-evolution shown as paired "last week | this week" cards with explicit "Data said / Council decided / Outcome" framing. shared.css rev 2: dropped `.screenshot-card`, `.week-chip`, `.callout-chip`, `.heatmap-overlay`, `.heatmap-pass-chip`, `.council-ring*`, `.stat-counter*`. Added `.lp-pair-stage` (centered flex row, top zone), `.lp-week-block` + `.lp-week-label` + `.lp-week-label__chip` (current/dim/final variants) + `.lp-card` (860×600 viewport with overflow hidden) + `.lp-card--dim` (saturate 0.5 brightness 0.94 for failure-week visual cue) + `.lp-image` (absolute-positioned, GSAP translateY for scroll), `.narrative-panel` (bottom band 280h with 80px side padding) + `.narrative-col` + decision/outcome variants + `.narrative-eyebrow` + `.narrative-body` + `.narrative-stat` (with --small variant for two-value displays). 5 mid-scenes rewritten: before-state (w00 solo, baseline narrative), transformation (w01|w02, council's first transformation), learning-beat (w03|w04, experiment + correction), recovery-arc (w08 dim | w09, failure + recovery), final-state (w00 dim | w10, full-arc bookend). 
LP scroll distances hardcoded per week from native 1440-wide screenshot heights (526/2162/3362/3414/3727/3671/3599/3587 px). Linear ease across full scene minus 2s entry buffer — w00's short height creates intentional slow scroll versus dense w10's fast scroll, telling the "page grew rich over 11 weeks" story visually. Counter ramps preserved on transformation (151→343) and final-state (151→323). - -## Blockers - -(none) - -## Render history - -- **2026-04-26 — silent render**: `demo-output/videos/webster-lp-demo.silent.mp4`. 130.0s, 1920×1080, h264, 30fps, ~10 MB. `hyperframes render -q high --strict` with 6 parallel workers, ~58s wall. Compiler warned non-blocking on `var(--brand-font-utility)` (IBM Plex Sans Condensed not in HyperFrames' deterministic font cache; Google Fonts CDN renders correctly anyway — confirmed in snapshots). - -## Day log - -- **2026-04-26 — Day 0 / step 1 — tracking-set**: Plan approved at `~/.claude/plans/u-decide-whats-optimal-structured-spindle.md`. Created `context/webster-video/CONTEXT.md` (compaction-resistant) and this STATUS.md. Working branch: `feat/webster-video-skill` off `dev`. -- **2026-04-26 — Day 0 / step 2 — assets-hydrated**: Wrote `skills/webster-video/scripts/hydrate-demo-assets.ts` (Bun + Node fs APIs). Run produced 11 weeks under `demo-output/landing-page/wNN/`. Note: w00 has no `visual-review.md` (it's the baseline before any council action) — script handles as optional. Removed empty `demo-output/landing-page/week-NN/` leftovers (just `.DS_Store`). -- **2026-04-26 — Day 0 / step 3 — assets-committed**: Two commits on `feat/webster-video-skill` (off `dev`): `0fdd9c6 chore(video): add tracking dir` (CONTEXT.md + STATUS.md, 100 insertions) and `042e16b feat(video): hydrate 11 weeks of LP demo assets + add hydrate script` (103 files, 9710 insertions, ~39 MB binary). Updated `.gitignore` to ignore `local-runs/` and `audio/*.raw.mp3` on this branch. 
-- **2026-04-26 — Day 1 / step 1 — hyperframes-bootstrapped**: `skills` CLI installed globally. `skills add heygen-com/hyperframes` cloned the heygen-com/hyperframes repo (11k stars on github) and installed 5 sub-skills. `hyperframes init video --example blank` scaffolded the project. HyperFrames CLI exposes: init/lint/inspect/preview/render/transcribe/tts/doctor/browser. **`tts` and `transcribe` replace ElevenLabs + Whisper from the original plan** — fewer external dependencies. Visual style decision: Swiss Pulse + brand-overlay (warm cream / deep teal / leaf green / soft gold). -- **2026-04-26 — Day 1 / step 2 — script-drafted**: `video/script.md` authored — 130s narration, 7 timed beats, ~155 words at deliberate pacing. Honesty checklist included: synthetic-data qualifier on every metric callout; pitch frames as "weekly evidence loops" not "AI made a prettier landing page"; w08 framed honestly as "underperformed" without doom; w09 framed as "council corrected" without celebration. -- **2026-04-26 — Day 1 / step 3 — data-drafted**: `video/data/{metrics,brand,council}.json` + `video/lib/{easings,trim-points}.js`. metrics.json includes verbatim per-week CSV from prompt lines 78–91 + headline metric + anchor weeks. brand.json mirrors `demo-output/landing-page/brand.json` with CSS-variable map for direct injection. council.json lists the 10 sim agents (planner, monitor, seo, brand-voice, conversion, copy, fh-compliance, visual-asset-director, redesigner, visual-reviewer) with model labels for color coding. trim-points.js defines the 90s social-cut frame ranges. 
-- **2026-04-26 — Day 1 / step 4 — components-drafted**: `video/shared.css` — single source for brand tokens, type primitives (`brand-title--xl/lg/md`, `brand-subline`), 5 reusable component classes (`synthetic-disclaimer`, `stat-counter` + value/label/delta variants, `heatmap-overlay`, `council-ring` + agent positioning, `brand-title`), and scene primitives (`scene`, `callout-chip`, `screenshot-card`, `week-chip`, `heatmap-pass-chip`). -- **2026-04-26 — Day 1 / step 5 — scenes-drafted**: 7 sub-compositions in `video/compositions/`, master in `video/index.html` sequencing them via `data-composition-src` at 0/12/28/48/70/98/118s. Track-index conflicts resolved (mobile screenshots and second-week chips moved to higher tracks). GSAP transform conflict in learning-beat fixed via `xPercent: -50` instead of `transform: translateX(-50%)` per linter guidance. `video/assets` symlinked to `../demo-output/landing-page` so HyperFrames serves committed weekly artifacts. -- **2026-04-26 — Day 1 / step 6 — draft-rendered**: `npx hyperframes lint` returns 0/0; `npx hyperframes inspect` returns 0 layout issues across 9 timeline samples; `npx hyperframes snapshot --at 6,20,38,58,84,108,124` rendered 7 PNGs in `video/snapshots/` (gitignored). Recovery-arc transition gap at master t=84 is a ~0.2s crossfade window — acceptable for video, not a defect. Title-card, before-state, transformation, learning-beat (resampled at 58s), recovery-arc (resampled at 80s and 92s), final-state, end-card all render with full content + correct brand styling. -- **2026-04-26 — Day 2 / step 1 — polish-applied**: 8 files edited (`shared.css` full rewrite + 7 scene scripts). Polish targets from plan addressed via direct in-repo edits rather than Claude-Design slot-packet ceremony — saves 3h Day 2 budget. Confirmation-gate answers: (1) confirmed; (2) Richie records over silent render → Auphonic → mux; (3) skip-to-day-2. 
-- **2026-04-26 — Day 2 / step 2 — silent-rendered**: `hyperframes render -o ../demo-output/videos/webster-lp-demo.silent.mp4 -q high --strict` produced a clean 130s 1920×1080 30fps h264 mp4 in ~58s wall (6 parallel workers). 7 polished snapshots verified visuals before render: title-card mask-reveal, before-state callout overshoot stagger, transformation w01 mid-morph at frame 36s with counter at 294, learning-beat split with arrow, recovery-arc w09 + 330 + "Signal preserved", final-state with PASS chip + heatmap building, end-card with editorial close. -- **2026-04-26 — Day 2 / step 3 — mux-script-ready**: `skills/webster-video/scripts/mux-narration.ts` written (replaces planned `auphonic-process.ts` — uses ffmpeg loudnorm `-16 LUFS` + bandpass 80–12k Hz instead of Auphonic API, avoiding the API-key dependency). Detects `audio/narration.raw.{wav,mp3,m4a}`, levels → `audio/narration.mp3`, muxes silent mp4 + leveled audio → `demo-output/videos/webster-lp-demo.mp4` via stream-copy video + AAC 192k audio. `--skip-level` flag for users who pre-level via Auphonic. Uses `execFileSync` (not shell-concat) to avoid command-injection risk per security hook. -- **2026-04-26 — Day 2 / step 4 — rebuilt-paired-lps**: User feedback on first silent render: drop mobile, show full LP scrolling top-to-bottom, center the LP, place analytics below, narrate the data-said/council-decided/outcome story there, show week evolution as paired last-week-vs-current-week. shared.css rev 2 + 5 mid-scene rewrites (before-state, transformation, learning-beat, recovery-arc, final-state). title-card and end-card unchanged. Pair scheme: w00 solo / w01|w02 / w03|w04 / w08|w09 / w00|w10 (final bookends w00 against w10 for the dramatic arc recap). Scroll distances hardcoded from `sips -g pixelHeight` of each `demo-output/landing-page/wNN/desktop.png` at 860w display width. `hyperframes lint`: 0/0. 7 scene-midpoint snapshots verified visuals before re-render. 
-- **2026-04-26 — Day 2 / step 5 — silent-rendered (rev 2)**: `hyperframes render -q high --strict` produced `demo-output/videos/webster-lp-demo.silent.mp4` — 130.0s, 1920×1080, h264, 30fps, **71 MB** (file size grew because paired-LP layout has more pixel diversity per frame than the simpler v1 layout). 1m 36s wall with 6 parallel workers. -- **2026-04-26 — Day 2 / step 6 — polish-bundle-ready**: User chose single-shot Claude Design polish before audio. Wrote `skills/webster-video/scripts/build-slot-packets.ts` and `skills/webster-video/scripts/apply-polish-bundle.ts`. Generated 7 polish-slot packets at `skills/webster-video/polish-slots//` (title-card, before-state, transformation, learning-beat, recovery-arc, final-state, end-card). Each has `slot.json` (master frame range, dimensions), `brand.tokens.json` (palette + typography + motion + honesty constraints), `baseline.html` (current scene), `baseline.css` (full shared.css), `baseline.png` (mid-scene snapshot), `acceptance.md` (what "done" looks like), `DO_NOT_TOUCH.md` (locked text/numbers/durations/honesty framing/HyperFrames contract). Plus top-level `README.md` (workflow) and `PROMPT.md` (claude.ai/design system prompt). Bundle zipped to `skills/webster-video/polish-slots.zip` (1.3 MB) — gitignored. `.gitignore` updated to ignore `**/handoff/` and the zip. `apply-polish-bundle.ts` reads `/handoff/index.html` (full replacement of scene HTML) and optional `handoff-shared/shared.css` (full replacement of shared.css), runs `hyperframes lint`, prints next steps for re-render. - -## Pending - -- **Claude Design polish (single-shot, in flight)**: upload `skills/webster-video/polish-slots.zip` (1.3 MB) to **claude.ai/design** with `polish-slots/PROMPT.md` as the system prompt. Iterate per slot. Save returned bundles to `skills/webster-video/polish-slots//handoff/index.html` (and optional `handoff-shared/shared.css`). Run `bun skills/webster-video/scripts/apply-polish-bundle.ts` to integrate. Re-render after. 
-- After polish: Richie records narration over the polished silent mp4 → `audio/narration.raw.{wav,mp3,m4a}` -- Run `bun skills/webster-video/scripts/mux-narration.ts` (auto-levels via ffmpeg loudnorm + muxes → `demo-output/videos/webster-lp-demo.mp4`) -- Optional: Auphonic-grade audio polish (manual upload at auphonic.com → save to `audio/narration.mp3` → re-run with `--skip-level`) -- Optional: captions via `hyperframes transcribe audio/narration.mp3` + ffmpeg subtitle burn-in diff --git a/demo-output/landing-page/INDEX.md b/demo-output/landing-page/INDEX.md new file mode 100644 index 0000000..6e9975e --- /dev/null +++ b/demo-output/landing-page/INDEX.md @@ -0,0 +1,49 @@ +# Webster LP Timelapse — 11-Week Walk-through + +The Richer Health landing page, audited and improved week-by-week by a 9-agent simulation council. Each row links to that week's deliverables: full-page screenshots at three breakpoints, a synthetic heatmap (JSON + SVG), the synthetic-analytics snapshot the monitor saw, and the visual reviewer's markdown verdict. + +`w00` is the deliberately-ugly starting state. `w01` is the council's first principled intervention. `w10` is the terminal polish. + +## How to read this + +- **Beat** — the council's classification of that week's intent (broad redesign vs. narrow fix vs. validation, etc.). +- **Visual review** — what the rendered page actually looked like, written by the simulation visual-reviewer agent against the proposal it was handed. PASS / BLOCK / fix-hints. +- **Verdict** — every committed week is a PASS at the visual gate. If a week had been vetoed it would not have shipped its artifacts; that no week is missing from the sequence is the gate working as intended. +- **Substrate** — `agents/simulation/webster-lp-sim-*` is the 1:1 simulation mirror of `agents/production/`. Production is what runs on Nicolette's live council; simulation is what produced the artifacts below. They share schema and prompt structure. 
+ +## The 11 weeks + +| Week | Beat | Desktop | Visual review | +| --- | --- | --- | --- | +| **[w00](w00/)** | Ugly baseline. Mock starting state, no council intervention yet. | [`desktop.png`](w00/desktop.png) | _no review — pre-council_ | +| **[w01](w01/)** | `BROAD_REDESIGN` — first principled intervention from the ugly baseline. Founder face above the fold, single-offer hero, lineage card framed as educational provenance not clinical efficacy, four-logo press strip. | [`desktop.png`](w01/desktop.png) | [`visual-review.md`](w01/visual-review.md) | +| **[w02](w02/)** | `NARROW_FIX_DEEP` — focused inside-the-frame fixes on top of w01's structure. | [`desktop.png`](w02/desktop.png) | [`visual-review.md`](w02/visual-review.md) | +| **[w03](w03/)** | Mid-arc iteration. Critics + redesigner converge on the proof / services balance. | [`desktop.png`](w03/desktop.png) | [`visual-review.md`](w03/visual-review.md) | +| **[w04](w04/)** | `VALIDATION + UPSTREAM_NARROW_FIX` — largest visual-review file in the arc. The council validates prior weeks and applies a targeted upstream correction. | [`desktop.png`](w04/desktop.png) | [`visual-review.md`](w04/visual-review.md) | +| **[w05](w05/)** | Hero-copy iteration. H1 rewritten to practitioner register; in-hero disclaimer; credential strip retired in favor of a single governing-body prose line. | [`desktop.png`](w05/desktop.png) | [`visual-review.md`](w05/visual-review.md) | +| **[w06](w06/)** | Services → Contact seam repair. The council closes a continuity gap between the program block and the discovery form. | [`desktop.png`](w06/desktop.png) | [`visual-review.md`](w06/visual-review.md) | +| **[w07](w07/)** | Contact-section narrow fix. Anchor week w06 frozen byte-for-byte upstream; only the contact section moves. 
| [`desktop.png`](w07/desktop.png) | [`visual-review.md`](w07/visual-review.md) | +| **[w08](w08/)** | `PRESS_STRIP_TRUST_ANCHOR_IN_CONTACT_FRAME` — relocates the press-strip trust signal into the contact frame so it sits right next to the action. | [`desktop.png`](w08/desktop.png) | [`visual-review.md`](w08/visual-review.md) | +| **[w09](w09/)** | `DISCOVERY_FORM_FIELD_FRICTION_DROP_CLINIC_SIZE` — drops a friction field from the discovery form (clinic_size); falsification-path beat. | [`desktop.png`](w09/desktop.png) | [`visual-review.md`](w09/visual-review.md) | +| **[w10](w10/)** | Terminal-state polish. Two surgical text patches: a hero credential-line claim softener and a contact-lede field-count cleanup. Page geometry byte-stable from w09. | [`desktop.png`](w10/desktop.png) | [`visual-review.md`](w10/visual-review.md) | + +## Per-week file inventory + +Every week directory contains: + +- `desktop.png`, `tablet.png`, `mobile.png` — full-page screenshots at 1440 / 768 / 375. +- `desktop-heatmap.svg`, `tablet-heatmap.svg`, `mobile-heatmap.svg` — synthetic heatmaps overlaid on the rendered layout. +- `heatmap.json` — the structured layout-and-reach measurements the heatmap is rendered from. +- `analytics.json` — synthetic 5000-user-panel analytics for the week (mock, framed as POC priors per the project's "no fabricated stats" rule). +- `visual-review.md` — the simulation visual-reviewer's verdict. Present from w01 onward; absent on w00 (no proposal to review). + +## What this proves + +The arc is the council closing on the brand bible, not anchoring to the ugly baseline. The first three weeks broaden, the middle weeks validate and refine, and the last two weeks land surgical text-only edits that confirm the visual frame is stable. Every week passes the visual gate; every week's screenshots and heatmap are reproducible from the committed assets. 
+ +## Adjacent evidence + +- [`assets/memory-stores-screenshots/`](../../assets/memory-stores-screenshots/) — Anthropic Managed Memory Stores Console captures showing six per-substrate stores filling over the run. +- [`agents/production/`](../../agents/production/) — the 9 pre-registered Managed Agent specs that power Nicolette's live council on `main`. +- [`agents/simulation/`](../../agents/simulation/) — the 9 simulation specs that produced this LP run. +- [`scripts/critic-genealogy.ts`](../../scripts/critic-genealogy.ts) — the runtime specialist-spawn hero, called from inside the orchestrator when no existing critic owns a detected gap. diff --git a/prompts/video-composition-session.md b/prompts/video-composition-session.md deleted file mode 100644 index a1b9942..0000000 --- a/prompts/video-composition-session.md +++ /dev/null @@ -1,344 +0,0 @@ -# Webster video composition session - -Use this prompt to start a fresh ChatGPT / coding-agent session whose only job is to turn the accepted local LP council run into a polished demo video using FFmpeg and, if feasible, a Forge/Remotion workflow. - -## Role - -You are a senior motion designer + build engineer. Create a high-quality, judge-facing Webster demo video from existing local artifacts. Prioritize a crisp story over exhaustive completeness. 
- -## Repo - -```text -/Users/richiesakhon/Projects/webster -``` - -Start by reading: - -```text -AGENTS.md -context/ARCHITECTURE.md -context/FEATURES.md -context/VISION.md -context/QUALITY-GATES.md -TODOS.md -``` - -Then inspect the accepted run: - -```text -local-runs/lp-council/w01-single-offer-visual-heatmap -``` - -Key artifact pattern: - -```text -local-runs/lp-council/w01-single-offer-visual-heatmap/screenshots/wNN/desktop.png -local-runs/lp-council/w01-single-offer-visual-heatmap/screenshots/wNN/mobile.png -local-runs/lp-council/w01-single-offer-visual-heatmap/screenshots/wNN/desktop-heatmap.svg -local-runs/lp-council/w01-single-offer-visual-heatmap/history/wNN/analytics.json -local-runs/lp-council/w01-single-offer-visual-heatmap/history/wNN/heatmap.json -local-runs/lp-council/w01-single-offer-visual-heatmap/history/wNN/visual-review.md -``` - -Weeks available: `w00` through `w10`. - -## Core story - -Do **not** pitch Webster as “AI made a prettier landing page.” Pitch it as: - -> Webster is a weekly landing-page council. It uses specialist agents, visual review, synthetic analytics, and heatmaps to run narrow experiments, detect regressions, and keep improving without turning the page into a generic AI mess. - -The strongest video arc is: - -```text -w00: bad generic wellness template -w01: single-offer premium institute baseline -w04: strongest metric week after learning from booking friction -w08: failed trust-anchor experiment -w09: council diagnoses failure and reduces form friction -w10: final polish -``` - -Use all weeks for a fast timelapse/morph if useful, but narrate only the anchor weeks above. - -## Required honesty labels - -Every chart, heatmap, and metric callout must label the data as synthetic: - -```text -Synthetic 5,000-user demo panel -Mock analytics, not real visitor data -Synthetic heatmap from DOM layout + mocked engagement -``` - -Never imply the analytics are real visitor analytics. 
- -## Metric table to use - -```csv -week,sessions,bounce,avg_time,scroll75,scroll100,cta,contact_views,contact_dropoff,mobile_h,desktop_h -w00,5044,0.630,92.8,0.388,0.181,151,1269,0.880,2760,1886 -w01,5044,0.473,119.8,0.470,0.217,314,1808,0.751,6136,4626 -w02,5046,0.469,137.4,0.470,0.219,343,1822,0.728,9388,6637 -w03,5045,0.478,139.6,0.478,0.219,331,1791,0.738,9628,6723 -w04,5048,0.467,140.3,0.481,0.219,344,1830,0.727,10118,7249 -w05,5047,0.469,139.8,0.474,0.220,332,1822,0.737,10107,7288 -w06,5049,0.466,139.1,0.481,0.220,333,1833,0.736,10453,7550 -w07,5048,0.478,130.5,0.476,0.221,331,1792,0.738,9748,7042 -w08,5038,0.481,136.8,0.467,0.221,327,1778,0.740,9918,7156 -w09,5037,0.475,130.6,0.469,0.221,330,1798,0.738,9796,7034 -w10,4932,0.477,138.6,0.468,0.220,323,1754,0.738,9775,7012 -``` - -Headline claim, carefully labeled: - -```text -Synthetic discovery-call intent: 151 → 323 clicks, 2.1× after 10 simulated weekly passes. -``` - -Also mention: - -```text -Bounce: 0.630 → 0.477 -Avg time: 92.8s → 138.6s -``` - -## Video target - -Create a 90–150 second high-quality mock demo video. Default to 120 seconds. - -Visual style: - -- premium clinical / editorial -- warm cream background -- forest green + soft gold accents -- subtle shadows, clean typography -- no neon SaaS gradients -- no fake browser dashboards that imply real traffic -- make labels large enough for judges to read in a compressed video player - -## Suggested storyboard - -### 0–12s — title / premise - -Visual: Webster wordmark or clean text card over a blurred w00→w10 strip. - -Copy: - -```text -Webster turns landing pages into weekly evidence loops. -``` - -Small label: - -```text -Local demo run · synthetic analytics -``` - -### 12–28s — before - -Show w00 desktop and mobile. - -Callouts: - -```text -Generic wellness template -Weak offer clarity -151 synthetic discovery-call clicks -``` - -### 28–48s — first council transformation - -Morph w00 → w01 → w02. 
- -Callouts: - -```text -Single-offer focus -Founder-led clinical trust -CTA clicks 151 → 343 by w02 -``` - -### 48–70s — learning beat - -Show w03 booking-slot experiment, then w04 correction. - -Callouts: - -```text -w03 tested booking friction -w04 moved trust upstream -Best week: 344 synthetic CTA clicks -``` - -### 70–98s — failed experiment / recovery - -Show w08 press-strip trust anchor and w09 form-friction fix. - -Callouts: - -```text -A clean-looking trust anchor underperformed. -The next council stopped polishing logos and removed form friction. -``` - -Overlay small metric deltas: - -```text -w08 CTA: 327 -w09 CTA: 330 -``` - -### 98–118s — final state - -Show w10 desktop, mobile, and heatmap overlay. - -Callouts: - -```text -PASS visual review -No horizontal overflow -3-field discovery form -``` - -### 118–130s — close - -Final card: - -```text -A landing page that learns every week. -``` - -Subline: - -```text -Specialist agents · visual review · synthetic heatmaps · narrow experiments -``` - -## Implementation paths - -Produce whichever path is fastest and cleanest. Prefer Remotion for the final render if setup is manageable; use FFmpeg for preprocessing, contact sheets, timelapse segments, and fallback assembly. - -### Path A — FFmpeg-first fallback - -Create `demo-output/videos/webster-lp-demo.mp4` directly with FFmpeg. - -Recommended intermediate assets: - -```text -demo-output/videos/frames/ -demo-output/videos/clips/ -demo-output/videos/audio/ # optional, can be silent for mock -``` - -Useful FFmpeg operations: - -1. Normalize screenshots to 16:9 canvases with blurred background. -2. Generate scroll/pan clips over tall screenshots using `crop` expressions. -3. Generate crossfades between anchor weeks. -4. Overlay text labels with `drawtext` or pre-rendered PNG/SVG title cards. -5. 
Use heatmap SVGs by converting to PNG first if needed: - -```bash -qlmanage -t -s 1440 -o demo-output/videos/frames path/to/desktop-heatmap.svg -# or use rsvg-convert / magick if installed -``` - -Example FFmpeg pan idea: - -```bash -ffmpeg -y -loop 1 -i input.png -t 6 \ - -vf "scale=1920:-1,crop=1920:1080:0:'min((ih-1080), t/6*(ih-1080))',fps=30" \ - -pix_fmt yuv420p clip.mp4 -``` - -Adapt this carefully per image height. - -### Path B — Remotion / Forge-preferred - -If Remotion is present or quick to add, create a small Remotion composition under a clearly named demo-video folder, e.g.: - -```text -video/webster-lp-demo/ -``` - -Or if the repo already has Remotion conventions, follow them. - -Composition requirements: - -- 1920×1080 -- 30fps -- 120–150 seconds -- deterministic asset paths -- no network fetches -- renders to: - -```text -demo-output/videos/webster-lp-demo.mp4 -``` - -Recommended Remotion components: - -```text -TitleCard -ScreenshotPan -WeekMetricCard -HeatmapOverlay -FailureRecoveryBeat -FinalSummaryCard -``` - -Use local screenshots and SVG heatmaps as static assets. Do not pull live URLs. - -### Forge workflow expectation - -If Forge is available, create/use a Forge task/workflow that: - -1. scaffolds the video composition -2. writes the Remotion components -3. renders a draft MP4 -4. runs a quick verification script -5. outputs a short report with file paths and any missing assets - -Keep the scope limited to video composition. Do not modify the LP council runner, production agents, or live orchestrator. 
- -## Verification checklist - -Before declaring done: - -```bash -bun run validate -``` - -Also verify: - -```text -- demo-output/videos/webster-lp-demo.mp4 exists -- video duration is between 90 and 150 seconds -- video resolution is 1920x1080 or 1280x720 minimum -- every analytics chart says synthetic/mock -- no live URL or real visitor claim appears -- w08 failure and w09 correction are clearly shown -- w10 final PASS/no-overflow is visible -``` - -Use `ffprobe`: - -```bash -ffprobe -v error -show_entries format=duration -show_streams demo-output/videos/webster-lp-demo.mp4 -``` - -## Deliverables - -At the end, report: - -```text -- final video path -- duration/resolution -- key source assets used -- commands run -- validation result -- remaining risks -``` - -Do not bury failures. If Remotion setup takes too long, switch to FFmpeg fallback and ship a clean mock video. diff --git a/scripts/watch-dispatches.sh b/scripts/watch-dispatches.sh deleted file mode 100755 index da8a0db..0000000 --- a/scripts/watch-dispatches.sh +++ /dev/null @@ -1,170 +0,0 @@ -#!/bin/bash -# Watcher for webster-ralph-dag dispatches. -# Polls each dispatch log for `dag_workflow_finished`, then invokes the dispatcher -# via `claude -p` with a trigger message so it can pick up FEATURES.md updates -# and fire the next batch. -# -# Usage: -# scripts/watch-dispatches.sh # uses default v5 slugs -# scripts/watch-dispatches.sh slug1 slug2 slug3 ... # explicit slugs to watch -# -# Each explicit slug is treated as both the feature-# (unknown; reports as "?") -# and the branch (feat/). Default mode hard-codes the v5 metadata. -# -# Runs forever until all watched dispatches report `dag_workflow_finished`. -# Backgrounded via nohup by the dispatcher. Safe to run multiple watchers for -# disjoint slug sets; the lockfile is keyed by slug-set hash. - -set -eu -set -o pipefail - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -WEBSTER_ROOT="$(cd "$SCRIPT_DIR/.." 
&& pwd)" -DISPATCHER_SETTINGS="$WEBSTER_ROOT/.claude/dispatcher-settings.json" -DISPATCHER_PROMPT="$WEBSTER_ROOT/.claude/dispatcher.md" -LOG_DIR="$WEBSTER_ROOT/tmp/logs" -WATCHER_LOG_DIR="$WEBSTER_ROOT/tmp/watcher" -POLL_SECONDS=30 - -mkdir -p "$WATCHER_LOG_DIR" - -# Dispatch table format: slug|feature|branch, one per line. -if [ "$#" -gt 0 ]; then - DISPATCHES="" - for slug in "$@"; do - DISPATCHES="${DISPATCHES}${slug}|?|feat/${slug}\n" - done - DISPATCHES=$(printf "%b" "$DISPATCHES") -else - DISPATCHES="planner-agent-spec-v5|#50|feat/planner-agent-spec-v5 -apply-worker-cli-v5|#39a|feat/apply-worker-cli-v5 -seed-demo-arc-w3w4-v5|#57|feat/seed-demo-arc-w3w4-v5" -fi - -SLUG_HASH=$(printf "%s" "$DISPATCHES" | shasum | cut -c1-8) -LOCKFILE="$WATCHER_LOG_DIR/watcher-${SLUG_HASH}.lock" -COMPLETED_FILE="$WATCHER_LOG_DIR/watcher-${SLUG_HASH}-completed.txt" -WATCHER_LOG="$WATCHER_LOG_DIR/watcher-${SLUG_HASH}.log" - -log() { - echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] $*" | tee -a "$WATCHER_LOG" >&2 -} - -cleanup() { - if [ -r "$LOCKFILE" ] && [ "$(cat "$LOCKFILE")" = "$$" ]; then - rm -f "$LOCKFILE" - log "watcher exiting, lockfile cleaned" - fi -} -trap cleanup EXIT INT TERM - -acquire_lock() { - if (set -C; echo "$$" > "$LOCKFILE") 2>/dev/null; then - return 0 - fi - - if [ -r "$LOCKFILE" ]; then - existing_pid=$(cat "$LOCKFILE") - if [ -n "$existing_pid" ] && kill -0 "$existing_pid" 2>/dev/null; then - echo "watcher already running for this slug set (pid=$existing_pid, lockfile=$LOCKFILE)" >&2 - exit 1 - fi - log "stale lockfile found (pid=$existing_pid), removing" - rm -f "$LOCKFILE" - fi - - if ! 
(set -C; echo "$$" > "$LOCKFILE") 2>/dev/null; then - echo "watcher lock acquisition raced for this slug set (lockfile=$LOCKFILE)" >&2 - exit 1 - fi -} - -acquire_lock - -touch "$COMPLETED_FILE" -touch "$WATCHER_LOG" - -log "watcher started, pid=$$, slug-hash=$SLUG_HASH" -log "watching:" -printf "%s\n" "$DISPATCHES" | while IFS='|' read -r slug feature branch; do - log " $slug ($feature) on $branch" -done - -escape_osascript_string() { - printf "%s" "$1" | sed 's/\\/\\\\/g; s/"/\\"/g' -} - -notify_osx() { - local title="$1" - local body="$2" - local escaped_title - local escaped_body - escaped_title=$(escape_osascript_string "$title") - escaped_body=$(escape_osascript_string "$body") - osascript -e "display notification \"$escaped_body\" with title \"$escaped_title\"" 2>/dev/null || true -} - -ping_dispatcher() { - local slug="$1" - local feature="$2" - local branch="$3" - local status="$4" - - local response_log - response_log="$WATCHER_LOG_DIR/response-${slug}-$(date +%s).log" - local msg - msg="watcher: dispatch ${feature} on ${branch} finished (status=${status}). Pick up this completion: read .claude/checkpoints/ for prior state, inspect forge isolation list for merge state, update context/FEATURES.md row (todo→in-progress→done/blocked), and if queue has room (<=3 concurrent), dispatch the next L11 feature per the dispatcher rules. Watcher log: $WATCHER_LOG. Workflow log: $LOG_DIR/${slug}.log." - - log "pinging dispatcher for $slug (status=$status) -> $response_log" - notify_osx "Webster dispatch: $feature" "$branch finished ($status). Spawning dispatcher." 
- - cd "$WEBSTER_ROOT" || return 1 - claude \ - -p "$msg" \ - --dangerously-skip-permissions \ - --model claude-opus-4-7 \ - --settings "$DISPATCHER_SETTINGS" \ - --system-prompt "$(cat "$DISPATCHER_PROMPT")" \ - > "$response_log" 2>&1 || log "WARN: claude -p exited non-zero for $slug (see $response_log)" - - log "dispatcher run completed for $slug" - notify_osx "Webster dispatcher" "Finished pass for $feature ($branch)." -} - -while true; do - remaining=0 - while IFS='|' read -r slug feature branch; do - [ -z "$slug" ] && continue - logfile="$LOG_DIR/${slug}.log" - - if grep -qxF "$slug" "$COMPLETED_FILE" 2>/dev/null; then - continue - fi - - remaining=$((remaining + 1)) - - [ -f "$logfile" ] || continue - grep -q 'dag_workflow_finished' "$logfile" || continue - - if grep -q '"anyFailed":true' "$logfile"; then - status="failed" - elif grep -q '"anyCompleted":true' "$logfile"; then - status="success" - else - status="unknown" - fi - - log "detected completion: $slug status=$status" - echo "$slug" >> "$COMPLETED_FILE" - - ping_dispatcher "$slug" "$feature" "$branch" "$status" - done <<< "$DISPATCHES" - - if [ "$remaining" -eq 0 ]; then - log "all ${SLUG_HASH} dispatches handled, watcher done" - notify_osx "Webster watcher" "All dispatches complete — watcher done." 
- break - fi - - sleep "$POLL_SECONDS" -done diff --git a/skills/webster-onboarding/SKILL.md b/skills/webster-onboarding/SKILL.md index 47197d0..d8cb539 100644 --- a/skills/webster-onboarding/SKILL.md +++ b/skills/webster-onboarding/SKILL.md @@ -250,10 +250,9 @@ EOF ## Reference files - `references/qa-bank.md` — full Q&A flow, dynamic-fill rules, URL-scrape failure modes -- `references/business-yaml-schema.md` — `business.yaml` schema with Empire example as template +- `references/business-yaml-schema.md` — `business.yaml` schema with a worked example - `references/key-handling.md` — locked disclaimer text, `.env.local` writing rules, verification flow - `references/remediation.md` — per-gate failure → remediation hint mapping -- `references/empire-fixture.md` — Empire Asphalt Paving brand identity for hackathon case-study video ## Companion scripts (in `scripts/onboarding/`) diff --git a/skills/webster-onboarding/references/empire-fixture.md b/skills/webster-onboarding/references/empire-fixture.md deleted file mode 100644 index cf1ed86..0000000 --- a/skills/webster-onboarding/references/empire-fixture.md +++ /dev/null @@ -1,59 +0,0 @@ -# Empire Asphalt Paving — hackathon case-study fixture - -Reference data for the 90-second case-study video showing Webster installation on Richie's dad's paving business. Used by the recording, not by the skill at runtime. - -The full storyboard, recording surface, and pre-recording checklist live in `context/ONBOARDING-CASE-STUDY.md`. This file is the brand-data shape used to dry-run the skill before recording. 
- -## Brand identity - -| field | value | -| ----------------- | ---------------------------------------------------------------------------------- | -| primary color | royal blue `#1B47A1` (from logo) | -| accent color | bright yellow `#F9D71C` (from logo) | -| voice register | warm-direct, premium-handcraft, family-business | -| reading level | 8th–9th grade | -| pronouns | "we" | -| do-not-use copy | "industry-leading", "innovative solutions", emoji, "synergy", hyperbole adjectives | -| do-not-use visual | stock-photo CGI trucks, cartoon icons, saturated primaries beyond brand colors | -| trust signals | 18 years, family-owned, fully insured, real past-job photos | - -## Brand corpus directory (must exist before recording) - -```text -context/brand-corpus/ -├── logo.png ← royal blue circle + yellow crown + cursive "e" -├── business-card.jpg ← real if available; mock if not (consistent with logo palette) -├── past-jobs/ -│ ├── job-1.jpg ← driveway paving -│ ├── job-2.jpg ← parking lot resurfacing -│ └── job-3.jpg ← patch repair / sealcoat -├── service-list.md ← typed by Richie from dad's known services -├── reviews.md ← 2-3 paraphrased real reviews; star count, name, year -└── voice-notes.md ← Richie-paraphrased dad quotes capturing voice tone -``` - -## `business.yaml` (full Empire example) - -See `references/business-yaml-schema.md` "Empire Asphalt Paving — example" section for the canonical YAML. 
- -## Pre-recording checklist (subset relevant to the skill) - -- [ ] `webster-onboarding` skill exists at this path with the P0–P5 phase model -- [ ] `bun run onboarding:verify-all` passes against a fresh test environment -- [ ] Empire's ugly v0 HTML committed to a fresh GitHub repo (e.g., `richsak/empire-paving-demo`) — handled by T13, not by the skill -- [ ] `context/brand-corpus/` populated with all corpus files for Empire -- [ ] Anthropic API key has Memory Stores + Managed Agents quota -- [ ] Cloudflare API token + GitHub PAT in `.env.local` -- [ ] One dry-run install completed successfully end-to-end before live take - -## Why this fixture matters - -The Empire case study is the second of three real humans in the demo chain (Nicolette in Beat 1, Dad here, Richie as builder/operator). It supplements the main 3-minute demo with a 90s "what does this look like for a real non-technical owner" view. The skill must produce the same end state Richie shows in the video — drift between the recording and what the skill actually does kills credibility. - -When dry-running the skill before recording, use this Empire fixture as the input. After the dry-run, confirm: - -1. `context/business.yaml` matches the schema example -2. `context/brand-corpus/` has all 7 expected files -3. `bun run onboarding:verify-all --phase p4` exits 0 -4. The first council session opens a PR against `richsak/empire-paving-demo` with a week-1 redesign -5. `context/onboarding-status.json` reads `phase: DONE` diff --git a/skills/webster-onboarding/references/qa-bank.md b/skills/webster-onboarding/references/qa-bank.md index 082273c..7be3a1f 100644 --- a/skills/webster-onboarding/references/qa-bank.md +++ b/skills/webster-onboarding/references/qa-bank.md @@ -47,7 +47,7 @@ For each file: 4. Reference paths in `business.yaml` under `corpus:` array — never base64-inline file contents 5. 
If it's a text-bearing file (`md`, `txt`, `csv`, OCR-readable `pdf`/`jpg`), extract the text and use it to auto-fill `business.yaml` fields per the same mapping as Source 1 -Recommended brand corpus layout (from `context/ONBOARDING-CASE-STUDY.md`): +Recommended brand corpus layout: ```text context/brand-corpus/ diff --git a/skills/webster-video/polish-slots/PROMPT.md b/skills/webster-video/polish-slots/PROMPT.md deleted file mode 100644 index 70c8580..0000000 --- a/skills/webster-video/polish-slots/PROMPT.md +++ /dev/null @@ -1,37 +0,0 @@ -# Webster — video polish brief - -You're designing the demo video for a hackathon submission ("Built with Opus 4.7", Anthropic). - -## Absolute constraint - -**Total video duration must be ≤ 2 minutes 30 seconds (150 seconds).** - -Everything else is your call. Layout, scene count, scene durations, copy, typography, motion language, color treatment, visual hierarchy, motion choreography — your judgment. - -## What's in the bundle - -7 scene directories under `polish-slots//`. Each is the current draft as reference material, not a spec to follow. - -Per slot: - -- `baseline.html` — current scene markup -- `baseline.css` — current shared CSS (visual context for color/typography decisions) -- `baseline.png` — frame from the current rendered scene -- `brand.tokens.json` — palette + typography + motion easings -- `slot.json` — scene metadata (current duration, frame range) - -## Source material - -The video tells the story of an AI council that iteratively improved a landing page across 11 weekly passes. The 7 baseline scenes show the current draft of that story. - -## Output format - -The render engine is HyperFrames (HTML/CSS/JS + GSAP, deterministic 1920×1080 @ 30fps). For each scene you polish, save: - -- `polish-slots//handoff/index.html` — full self-contained scene HTML with inline ` - - -
diff --git a/video/lib/easings.js b/video/lib/easings.js
deleted file mode 100644
index 891fed8..0000000
--- a/video/lib/easings.js
+++ /dev/null
@@ -1,11 +0,0 @@
-/* global window */
-/* Webster — GSAP easing presets shared across scenes */
-window.WEBSTER_EASING = window.WEBSTER_EASING || {
-  enter: "power2.out",
-  exit: "power2.in",
-  emphasis: "expo.out",
-  whip: "power4.in",
-  morph: "power3.inOut",
-  ramp: "circ.out",
-  drift: "sine.inOut",
-};
diff --git a/video/lib/trim-points.js b/video/lib/trim-points.js
deleted file mode 100644
index 20fd87d..0000000
--- a/video/lib/trim-points.js
+++ /dev/null
@@ -1,37 +0,0 @@
-/* global window */
-/* Webster — 90-second social cut trim points
- *
- * The full 130s storyboard has 7 beats. The 90s social variant drops the
- * "learning beat" (48-70s) and the "recovery arc" (70-98s) to keep only the
- * before / first-transformation / final-state / close arc. Re-time afterward.
- *
- * Frame counts assume 30fps per hyperframes default. Frame ranges are
- * [start, end) inclusive of start, exclusive of end.
- */
-window.WEBSTER_TRIM = window.WEBSTER_TRIM || {
-  full_130s: {
-    fps: 30,
-    total_frames: 3900,
-    beats: [
-      { id: "title-card", start: 0, end: 360 },
-      { id: "before-state", start: 360, end: 840 },
-      { id: "transformation", start: 840, end: 1440 },
-      { id: "learning-beat", start: 1440, end: 2100 },
-      { id: "recovery-arc", start: 2100, end: 2940 },
-      { id: "final-state", start: 2940, end: 3540 },
-      { id: "end-card", start: 3540, end: 3900 },
-    ],
-  },
-  social_90s: {
-    fps: 30,
-    total_frames: 2700,
-    drops: ["learning-beat", "recovery-arc"],
-    beats: [
-      { id: "title-card", start: 0, end: 360 },
-      { id: "before-state", start: 360, end: 840 },
-      { id: "transformation", start: 840, end: 1440 },
-      { id: "final-state", start: 1440, end: 2340 },
-      { id: "end-card", start: 2340, end: 2700 },
-    ],
-  },
-};
diff --git a/video/meta.json b/video/meta.json
deleted file mode 100644
index 4534f67..0000000
--- a/video/meta.json
+++ /dev/null
@@ -1,5 +0,0 @@
-{
-  "id": "video",
-  "name": "video",
-  "createdAt": "2026-04-26T13:04:52.799Z"
-}
diff --git a/video/script.md b/video/script.md
deleted file mode 100644
index 9c6451c..0000000
--- a/video/script.md
+++ /dev/null
@@ -1,83 +0,0 @@
-# Webster — 130s narration
-
-> Source storyboard: `prompts/video-composition-session.md` lines 120–216.
-> Voice tokens: warm-authoritative, evidence-aware, professionally provocative — per `demo-output/landing-page/brand.json`.
-> Honesty rule: never imply real visitor analytics. Headline metric must include the "synthetic" qualifier.
-> Pitch rule: never frame as "AI made a prettier landing page" (per prompt line 47). Frame as a council that runs weekly evidence loops.
-
-## Beat-by-beat (timed)
-
-### Beat 1 — Title / premise (0–12s)
-
-- **Visual**: blurred w00→w10 strip beneath the wordmark; `` fade-in; `` appears bottom-right.
-- **On-screen**: `Webster turns landing pages into weekly evidence loops.`
-- **Sub-label**: `Local demo run · synthetic analytics`
-- **VO**: "Webster turns landing pages into weekly evidence loops."
-
-### Beat 2 — Before (12–28s)
-
-- **Visual**: w00 desktop + mobile at scale; callout chips fade in.
-- **Callouts**: `Generic wellness template` · `Weak offer clarity` · `151 synthetic discovery-call clicks`
-- **VO**: "Week zero. A generic wellness template. Weak offer clarity. A hundred and fifty-one synthetic discovery-call clicks."
-
-### Beat 3 — First council transformation (28–48s)
-
-- **Visual**: w00 → w01 → w02 morph; council ring overlays briefly; CTA-clicks counter ramps 151→343.
-- **Callouts**: `Single-offer focus` · `Founder-led clinical trust` · `CTA 151 → 343 by w02`
-- **VO**: "The council ran. Three weeks later, the page found a single offer, founder-led clinical trust, and a credible voice. Synthetic CTA clicks more than doubled — three hundred forty-three by week two."
-
-### Beat 4 — Learning beat (48–70s)
-
-- **Visual**: w03 booking-slot test on the left, w04 correction on the right; small delta annotations.
-- **Callouts**: `w03 tested booking friction` · `w04 moved trust upstream` · `Best week: 344 synthetic CTA clicks`
-- **VO**: "Week three tested a booking-friction experiment. The data said no. Week four moved trust upstream — three hundred forty-four clicks, the strongest week of the run."
-
-### Beat 5 — Failed experiment / recovery (70–98s)
-
-- **Visual**: w08 press-strip trust anchor (slightly desaturated) → w09 form-friction fix; metric deltas overlay.
-- **Callouts**: `A clean-looking trust anchor underperformed.` · `The next council stopped polishing logos and removed form friction.` · `w08 CTA: 327 · w09 CTA: 330`
-- **VO**: "Then a clean-looking trust anchor underperformed. Week eight, three hundred twenty-seven. The next council stopped polishing logos and removed form friction. Week nine, three hundred thirty. Signal preserved."
-
-### Beat 6 — Final state (98–118s)
-
-- **Visual**: w10 desktop and mobile side-by-side; heatmap overlay fades in; PASS verdict chip.
-- **Callouts**: `PASS visual review` · `No horizontal overflow` · `3-field discovery form`
-- **VO**: "Week ten. PASS visual review. A three-field discovery form. No horizontal overflow. The heatmap maps synthetic engagement onto real surfaces."
-
-### Beat 7 — Close (118–130s)
-
-- **Visual**: end-card with Webster wordmark, "Built with Opus 4.7", and URL.
-- **On-screen**: `A landing page that learns every week.`
-- **Sub-label**: `Specialist agents · visual review · synthetic heatmaps · narrow experiments`
-- **VO**: "A landing page that learns every week. Specialist agents. Visual review. Synthetic heatmaps. Narrow experiments."
-
-## Plain narration (for `npx hyperframes tts`)
-
-Webster turns landing pages into weekly evidence loops.
-
-Week zero. A generic wellness template. Weak offer clarity. A hundred and fifty-one synthetic discovery-call clicks.
-
-The council ran. Three weeks later, the page found a single offer, founder-led clinical trust, and a credible voice. Synthetic CTA clicks more than doubled — three hundred forty-three by week two.
-
-Week three tested a booking-friction experiment. The data said no. Week four moved trust upstream — three hundred forty-four clicks, the strongest week of the run.
-
-Then a clean-looking trust anchor underperformed. Week eight, three hundred twenty-seven. The next council stopped polishing logos and removed form friction. Week nine, three hundred thirty. Signal preserved.
-
-Week ten. PASS visual review. A three-field discovery form. No horizontal overflow. The heatmap maps synthetic engagement onto real surfaces.
-
-A landing page that learns every week. Specialist agents. Visual review. Synthetic heatmaps. Narrow experiments.
-
-## Pacing notes
-
-- Total ~155 words at ~71 words/minute (deliberate pacing) ≈ 130s.
-- Add ~0.5s pause at every paragraph break; ~0.25s at every period within a paragraph.
-- Voice direction for ElevenLabs/Auphonic prompt: "warm-authoritative; trusted mentor not vendor; clinically credible without sterile; professionally provocative."
-
-## Honesty checklist
-
-- ☑ "synthetic" appears in beats 1, 2, 3, 5, 6, 7
-- ☑ No claim of real-traffic analytics
-- ☑ w08 framed as "underperformed" (not catastrophic; not minimized)
-- ☑ w09 framed as "council corrected" (not victory celebration)
-- ☑ No "AI made a prettier landing page" framing — uses "council ran", "specialist agents", "narrow experiments"
-- ☑ Headline metric stated with "synthetic" qualifier (Beat 3)
diff --git a/video/shared.css b/video/shared.css
deleted file mode 100644
index 6e2070a..0000000
--- a/video/shared.css
+++ /dev/null
@@ -1,308 +0,0 @@
-/* Webster — shared brand + component styles
- *
- * Single source of truth for visual primitives across all scenes.
- * Restructured 2026-04-26 (rev 2): paired scrolling LPs centered up top,
- * data/council/outcome narrative panel below. Mobile screenshots dropped.
- *
- * Brand tokens mirror video/data/brand.json which mirrors
- * demo-output/landing-page/brand.json (committed Day 0).
- */
-
-@import url("https://fonts.googleapis.com/css2?family=Cormorant+Garamond:wght@400;500;600;700&family=Inter:wght@300;400;500;600;700&family=IBM+Plex+Sans+Condensed:wght@400;500;600&display=swap");
-
-:root {
-  --brand-deep-teal: #0f4c5c;
-  --brand-warm-cream: #f7f1e8;
-  --brand-leaf-green: #3f7d58;
-  --brand-charcoal: #243034;
-  --brand-soft-gold: #c8a96a;
-
-  --brand-font-heading: "Cormorant Garamond", "Playfair Display", serif;
-  --brand-font-body: "Inter", "Source Sans 3", sans-serif;
-  --brand-font-utility: "IBM Plex Sans Condensed", sans-serif;
-
-  --brand-easing-enter: cubic-bezier(0.16, 1, 0.3, 1);
-  --brand-easing-exit: cubic-bezier(0.7, 0, 0.84, 0);
-  --brand-easing-spring: cubic-bezier(0.34, 1.56, 0.64, 1);
-
-  --brand-shadow-card:
-    0 1px 3px rgba(36, 48, 52, 0.1), 0 8px 24px rgba(36, 48, 52, 0.16),
-    0 24px 60px rgba(36, 48, 52, 0.18);
-}
-
-/* ============================================================
- * brand-title — POLISH SLOT (used by title-card, end-card)
- * ============================================================ */
-
-.brand-title {
-  font-family: var(--brand-font-heading);
-  color: var(--brand-charcoal);
-  letter-spacing: -0.012em;
-  line-height: 1.04;
-  margin: 0;
-  text-wrap: balance;
-}
-
-.brand-title--xl {
-  font-size: 100px;
-  font-weight: 500;
-  letter-spacing: -0.022em;
-}
-
-.brand-title--lg {
-  font-size: 72px;
-  font-weight: 500;
-  letter-spacing: -0.018em;
-}
-
-.brand-title--md {
-  font-size: 56px;
-  font-weight: 500;
-}
-
-.brand-subline {
-  font-family: var(--brand-font-utility);
-  font-size: 22px;
-  letter-spacing: 0.18em;
-  text-transform: uppercase;
-  color: var(--brand-deep-teal);
-  margin: 0;
-}
-
-/* ============================================================
- * synthetic-disclaimer — used in every scene
- * Three exact phrases from prompts/video-composition-session.md
- * lines 64–74; opacity ≥0.88 always per honesty contract.
- * ============================================================ */
-
-.synthetic-disclaimer {
-  position: absolute;
-  bottom: 32px;
-  right: 32px;
-  font-family: var(--brand-font-utility);
-  font-size: 16px;
-  letter-spacing: 0.14em;
-  text-transform: uppercase;
-  color: var(--brand-charcoal);
-  opacity: 0.92;
-  background: rgba(247, 241, 232, 0.9);
-  padding: 10px 18px 10px 16px;
-  border: 1px solid rgba(36, 48, 52, 0.16);
-  border-left: 3px solid var(--brand-deep-teal);
-  border-radius: 2px;
-  box-shadow: 0 4px 16px rgba(36, 48, 52, 0.1);
-  z-index: 100;
-}
-
-/* ============================================================
- * lp-pair-stage — paired scrolling LP cards (top zone)
- *
- * Layout: 50px top pad, 28px label row, 14px gap, 600px LP card,
- * leaving room for the narrative panel below. Solo before-state
- * scene centers a single 860w card; paired scenes show two side
- * by side with 40px gap.
- * ============================================================ */
-
-.lp-pair-stage {
-  position: absolute;
-  top: 50px;
-  left: 0;
-  right: 0;
-  height: 642px;
-  display: flex;
-  justify-content: center;
-  gap: 40px;
-}
-
-.lp-week-block {
-  display: flex;
-  flex-direction: column;
-  gap: 14px;
-  flex-shrink: 0;
-}
-
-.lp-week-label {
-  height: 28px;
-  display: flex;
-  align-items: center;
-  gap: 12px;
-  font-family: var(--brand-font-utility);
-  font-size: 16px;
-  letter-spacing: 0.22em;
-  text-transform: uppercase;
-  color: var(--brand-charcoal);
-  font-weight: 600;
-  opacity: 0.88;
-}
-
-.lp-week-label__chip {
-  display: inline-flex;
-  align-items: center;
-  padding: 5px 12px;
-  background: var(--brand-charcoal);
-  color: var(--brand-warm-cream);
-  border-radius: 2px;
-  font-size: 14px;
-  letter-spacing: 0.22em;
-  font-weight: 600;
-}
-
-.lp-week-label__chip--current {
-  background: var(--brand-deep-teal);
-}
-
-.lp-week-label__chip--final {
-  background: var(--brand-leaf-green);
-}
-
-.lp-week-label__chip--dim {
-  background: var(--brand-charcoal);
-  opacity: 0.7;
-}
-
-.lp-card {
-  width: 860px;
-  height: 600px;
-  background: #fff;
-  box-shadow: var(--brand-shadow-card);
-  border-radius: 8px;
-  overflow: hidden;
-  position: relative;
-}
-
-.lp-card--dim {
-  filter: saturate(0.5) brightness(0.94);
-}
-
-.lp-image {
-  display: block;
-  width: 100%;
-  height: auto;
-  position: absolute;
-  top: 0;
-  left: 0;
-  will-change: transform;
-}
-
-/* ============================================================
- * narrative-panel — Data said / Council decided / Outcome
- *
- * 280px tall band below the LP stage. 2-column for the baseline
- * (before-state, no decision yet) or 3-column for paired scenes.
- * Outcome column carries the metric stat with optional ramp.
- * ============================================================ */
-
-.narrative-panel {
-  position: absolute;
-  bottom: 68px;
-  left: 80px;
-  right: 80px;
-  height: 280px;
-  display: flex;
-  align-items: stretch;
-  gap: 28px;
-}
-
-.narrative-col {
-  flex: 1;
-  display: flex;
-  flex-direction: column;
-  gap: 16px;
-  padding: 28px 30px;
-  background: rgba(15, 76, 92, 0.05);
-  border-left: 4px solid var(--brand-deep-teal);
-  border-radius: 4px;
-}
-
-.narrative-col--decision {
-  border-left-color: var(--brand-leaf-green);
-  background: rgba(63, 125, 88, 0.06);
-}
-
-.narrative-col--outcome {
-  border-left-color: var(--brand-soft-gold);
-  background: rgba(200, 169, 106, 0.07);
-}
-
-.narrative-col--solo {
-  flex: 1.6;
-}
-
-.narrative-eyebrow {
-  font-family: var(--brand-font-utility);
-  font-size: 14px;
-  letter-spacing: 0.22em;
-  text-transform: uppercase;
-  color: var(--brand-deep-teal);
-  font-weight: 600;
-  margin: 0;
-}
-
-.narrative-col--decision .narrative-eyebrow {
-  color: var(--brand-leaf-green);
-}
-
-.narrative-col--outcome .narrative-eyebrow {
-  color: var(--brand-soft-gold);
-  filter: brightness(0.82);
-}
-
-.narrative-body {
-  font-family: var(--brand-font-body);
-  font-size: 24px;
-  font-weight: 500;
-  line-height: 1.32;
-  color: var(--brand-charcoal);
-  margin: 0;
-}
-
-.narrative-stat {
-  font-family: "JetBrains Mono", "IBM Plex Mono", monospace;
-  font-size: 64px;
-  font-weight: 600;
-  font-variant-numeric: tabular-nums;
-  color: var(--brand-soft-gold);
-  letter-spacing: -0.025em;
-  line-height: 1;
-  margin: 0;
-}
-
-.narrative-stat--small {
-  font-size: 48px;
-}
-
-.narrative-stat__delta {
-  font-family: var(--brand-font-body);
-  font-size: 18px;
-  font-weight: 500;
-  color: var(--brand-leaf-green);
-  margin: 0;
-  letter-spacing: -0.005em;
-}
-
-.narrative-stat__delta--neutral {
-  color: var(--brand-charcoal);
-  opacity: 0.7;
-}
-
-/* ============================================================
- * Generic scene primitives
- * ============================================================ */
-
-.scene {
-  position: absolute;
-  inset: 0;
-  width: 1920px;
-  height: 1080px;
-  background: var(--brand-warm-cream);
-  color: var(--brand-charcoal);
-  overflow: hidden;
-}
-
-.scene__pad {
-  position: absolute;
-  inset: 80px;
-  display: flex;
-  flex-direction: column;
-  justify-content: center;
-}