Conversation
- Fix dev/ → dev (trailing slash is invalid in git branch names)
- Add explicit Worktree + PR flow section: base off dev, PR to dev, merge to dev. Documents Forge task branches and the Claude Code feat/T&lt;n&gt;-&lt;slug&gt; pattern.
- Hackathon rollup procedure: dev → main is a single atomic PR after T10 completes; production stays coherent until then.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
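The branch-name rule above can be checked with git itself: `git check-ref-format` rejects the trailing-slash form and accepts the corrected names. A quick sketch, not part of the repo; the `feat/T3-heatmap-polish` slug is a made-up example of the documented pattern:

```shell
# Trailing slash makes a ref name invalid; the corrected names pass.
git check-ref-format "refs/heads/dev" && echo "dev: ok"
git check-ref-format "refs/heads/feat/T3-heatmap-polish" && echo "feat branch: ok"
git check-ref-format "refs/heads/dev/" || echo "dev/: rejected (trailing slash)"
```

The same check works for any candidate task-branch name before pushing.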
New top-level section in EXPANSION-TASKS.md tells a fresh session to:
- Read the AGENTS.md first-actions list in full before coding
- Start T0 immediately without confirmation
- Stop after T0 (validate green + committed), report 3-5 lines to Richie, wait for green-light on T1
- From T1 onward: proceed without approval BUT post a 2-line "Starting T&lt;n&gt;" announcement with a files-touched list before each task, so Richie has visibility to interrupt without blocking

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ci): install ImageMagick + Playwright Chromium in test job
The validate job's "Run tests" step was failing on dev with two
missing-binary errors:
- `magick` not in $PATH — visual-assets / heatmap rendering shells out
to ImageMagick. Ubuntu runners do not include it by default.
- Playwright Chromium executable missing — browser-audit tests launch
Chromium via Playwright. The bundled `playwright` package needs an
explicit `playwright install` to fetch the headless browser.
Adds both as steps before "Run tests" so the dev/main baseline goes
green again and downstream PRs stop inheriting the red.
* fix(ci): symlink magick -> convert for ImageMagick v6 compat
Ubuntu's `imagemagick` package ships v6 binaries (convert, mogrify),
not v7's unified `magick` entrypoint. scripts/build-demo-manifest.ts
calls `execFileSync("magick", ...)` so the v7 name is required.
Symlink `magick` -> `$(which convert)` after the apt install. The
chained-argv syntax we use is v6-compatible.
* fix(ci): install ImageMagick 7 portable binary instead of apt v6
The apt imagemagick package provides v6-only binaries. The repo uses
v7 multi-call dispatch (magick, magick identify, ...) which v6 cannot
satisfy via convert symlinks. Pull the v7 portable binary from the
official ImageMagick release and drop it into /usr/local/bin.
* fix(ci): use v6 + magick dispatcher wrapper instead of v7 download
Portable ImageMagick 7 binary URL was 404 — no canonical 'magick' static
release exists at imagemagick.org. Switch back to apt's v6 install and
add a small bash dispatcher at /usr/local/bin/magick that forwards to
the v6 binary based on the subcommand. Handles 'magick (...)',
'magick identify ...', and the other tools the repo uses.
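The dispatcher's subcommand mapping can be modeled with a small function. This is an illustrative sketch only: the committed wrapper would `exec` the real v6 binaries, while this version echoes the argv it would forward so the mapping is visible without ImageMagick installed.

```shell
# Hypothetical model of the /usr/local/bin/magick shim: v7-style
# "magick <subcommand> ..." forwards to the v6 binary of the same name;
# anything else falls through to v6's convert entrypoint.
magick_dispatch() {
  case "${1:-}" in
    identify|mogrify|composite|montage|compare)
      echo "$@" ;;            # "magick identify x.png" -> "identify x.png"
    *)
      echo "convert" "$@" ;;  # "magick x.png -resize 50% y.png" -> "convert ..."
  esac
}

magick_dispatch identify photo.png             # -> identify photo.png
magick_dispatch photo.png -resize 50% out.png  # -> convert photo.png -resize 50% out.png
```

A real shim would replace each `echo` with `exec`, preserving exit codes for the CI test step.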
* chore(video): add tracking dir for webster-video skill
CONTEXT.md is compaction-resistant (mission, locked decisions, critical
paths, current phase, don't-drift invariants, polish-slot index). Single
Read call restores alignment after autocompact or fresh session.
STATUS.md is the running tracker: phase, latest action, blockers, render
history, day log.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(video): hydrate 11 weeks of LP demo assets + add hydrate script
Day 0 of the webster-video skill plan. Copies weekly council-simulation
artifacts from local-runs/lp-council/w01-single-offer-visual-heatmap/
(simulation working copy) into demo-output/landing-page/wNN/ (committed
deliverable, the canonical handoff per T7/T9 task graph).
Per-week (w00-w10): desktop/mobile/tablet PNGs + matching heatmap SVGs +
analytics.json + heatmap.json + visual-review.md (w00 omits since it is
the pre-council baseline). Plus brand.json, agents.json, manifest.json
with hash + week list at the run root.
Also gitignores local-runs/ (now that the durable copy is in
demo-output/) and audio/*.raw.mp3 (only the Auphonic-leveled narration.mp3
is committed).
Unblocks remote planning surfaces (Ultraplan, fresh clones) which only
see committed files; the timelapse story now travels with the repo.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(video): scaffold HyperFrames project at video/
Installed @heygen-com/hyperframes skill bundle (5 sub-skills: gsap,
hyperframes, hyperframes-cli, hyperframes-registry, website-to-hyperframes)
into .agents/skills/, symlinked into .claude/skills/ for Claude Code.
Ran hyperframes init video --example blank to scaffold the project root
with index.html (1920x1080 master composition, GSAP timeline registered on
window.__timelines["main"]), hyperframes.json config, and the bundled
AGENTS.md / CLAUDE.md guides.
Pre-flight verified: node v25.9.0, ffmpeg 8.0.1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(video): structural draft — 7 scenes + master composition
Day 1 baseline of the webster-video skill plan. The structural draft
renders the full 130-second story end-to-end via HyperFrames sub-compositions,
with plain-but-working baselines that Claude Design will polish in Phase B.
Layout:
- video/index.html — master composition (1920x1080 at 130s) sequencing the 7
sub-compositions via data-composition-src
- video/compositions/{title-card,before-state,transformation,learning-beat,
recovery-arc,final-state,end-card}.html — one HyperFrames sub-composition
per storyboard beat. Each has data-composition-id, scoped GSAP timeline
registered on window.__timelines, and class="clip" elements with
data-start/data-duration/data-track-index per the framework contract.
- video/shared.css — brand tokens (deep teal, warm cream, leaf green,
charcoal, soft gold) + typography (Cormorant Garamond, Inter, IBM Plex
Sans Condensed) + 5 reusable component classes (brand-title,
synthetic-disclaimer, stat-counter, heatmap-overlay, council-ring) +
scene primitives. Single source of truth for visual primitives;
Claude Design will polish individual classes per the slot contract.
- video/data/{metrics,brand,council}.json — typed copies of the per-week CSV
from prompts/video-composition-session.md, brand tokens from
demo-output/landing-page/brand.json, and the 10-agent council roster.
- video/lib/{easings,trim-points}.js — shared GSAP easings and the 90s
social-cut frame-range definitions for the trim-points polish slot.
- video/script.md — 130s narration with beat-by-beat visual cues, callouts,
voice direction, and an honesty checklist tying every claim back to the
synthetic-data invariant.
- video/assets — symlink to ../demo-output/landing-page so HyperFrames
serves committed weekly artifacts via static paths.
Validation:
- npx hyperframes lint: 0 errors, 0 warnings across all 8 files
- npx hyperframes inspect: 0 layout issues across 9 timeline samples
Also gitignores the AI-platform symlinks installed by `skills add` (parallel
agent installs we don't use) and treats .agents/skills/ + .claude/skills/
like node_modules (re-installable via npx skills add heygen-com/hyperframes).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(video): update STATUS — Day 1 draft rendered; CONFIRMATION GATE
All 7 HyperFrames sub-compositions render with brand-aligned visuals:
0 lint errors, 0 layout issues, 7/7 scene snapshots passed visual check.
video/snapshots/ and video/renders/ added to .gitignore (regenerable).
Awaiting Richie's confirmation before Day 2 (audio chain, slot packets,
Claude Design polish, final render).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(video): polish round — type tightening, layered shadows, motion choreography
shared.css: layered card shadows via --brand-shadow-card, tightened brand-title
letter-spacing, refined synthetic-disclaimer with left-edge teal accent + soft
drop shadow, stat-counter value depth via text-shadow + tabular-nums, thicker
callout-chip accent border.
7 scene timelines: mask-reveal entrances on title-card + end-card via
clipPath inset; overshoot-stagger callouts on before/transformation/learning/
recovery/final via back.out(1.4); scale-on-entry for w04 + w09; PASS chip
stamps with back.out(1.7). Counter ramp on transformation kept as the single
metric story arc.
No HTML structure changes — data-* attributes, .clip class, and
window.__timelines registration preserved per HyperFrames contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(video): silent render + mux-narration script
Silent render: demo-output/videos/webster-lp-demo.silent.mp4
130s · 1920x1080 · 30fps · h264 · ~10 MB · rendered with hyperframes
-q high --strict in ~58s wall with 6 parallel workers.
mux-narration.ts: replaces planned auphonic-process.ts with ffmpeg
loudnorm (-16 LUFS) + bandpass + AAC mux. Auto-detects audio/narration.raw.{wav,mp3,m4a}, levels to audio/narration.mp3, muxes
silent mp4 + leveled audio into demo-output/videos/webster-lp-demo.mp4.
--skip-level flag for users who pre-level externally (Auphonic).
Uses execFileSync (no shell concat) per security hook.
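The leveling pass has roughly the following shape. Flag values here are assumptions for illustration (the exact chain lives in mux-narration.ts); a generated one-second sine tone stands in for `audio/narration.raw.wav` so the filter chain can run anywhere ffmpeg is present:

```shell
# Skip cleanly on machines without ffmpeg.
command -v ffmpeg >/dev/null || { echo "ffmpeg not installed; skipping"; exit 0; }

# Stand-in input: a synthesized tone instead of real narration.
# Filter chain: highpass (bandpass-ish cleanup) + EBU R128 loudness
# normalization to -16 LUFS; output encoded with AAC (the real script
# writes mp3 at this stage).
ffmpeg -v error -y \
  -f lavfi -i "sine=frequency=440:duration=1" \
  -af "highpass=f=80,loudnorm=I=-16:TP=-1.5:LRA=11" \
  -c:a aac /tmp/narration-leveled.m4a
echo "wrote /tmp/narration-leveled.m4a"
```

Muxing is then a second invocation that copies the silent video stream (`-c:v copy`) and attaches the leveled audio track.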
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(video): rebuild scenes — paired scrolling LPs + Data/Council/Outcome panel
Per user feedback after first silent render:
- Drop mobile screenshots; desktop only.
- Show full LP scrolling top-to-bottom over each scene's duration.
- Center the LP up top, place analytics + narrative below.
- Show week evolution by pairing last week | current week.
- Pinpoint what data said and how the council decided to react.
shared.css rev 2: drop screenshot-card, week-chip, callout-chip, heatmap-*,
council-ring*, stat-counter*. Add lp-pair-stage / lp-week-block / lp-week-label
(+ chip current/dim/final variants) / lp-card (860x600, overflow hidden) /
lp-card--dim (saturate 0.5 brightness 0.94) / lp-image (absolute, GSAP
translateY for scroll). Add narrative-panel / narrative-col (+ decision/
outcome variants) / narrative-eyebrow / narrative-body / narrative-stat
(+ --small variant for two-value displays) / narrative-stat__delta.
Pair scheme:
- before-state w00 solo (baseline narrative)
- transformation w01 | w02 (council's first transformation)
- learning-beat w03 | w04 (experiment + correction)
- recovery-arc w08 dim | w09 (failure + recovery)
- final-state w00 dim | w10 (full-arc bookend recap)
Scroll distances hardcoded per week from native 1440-wide screenshot heights:
526 / 2162 / 3362 / 3414 / 3727 / 3671 / 3599 / 3587 px. Linear ease across
scene minus 2s entry buffer. w00 short height creates intentional slow scroll
versus dense w10 fast scroll — visual story of "the page grew rich over 11
weeks". Counter ramps preserved on transformation (151→343) and final-state
(151→323).
title-card and end-card unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(video): build Claude Design polish bundle (7 slots, 1.3 MB)
Single-shot polish workflow: prepare 7 self-contained slot packets,
upload as zip to claude.ai/design, receive polished bundles, integrate
back via apply-polish-bundle.ts.
build-slot-packets.ts: generates skills/webster-video/polish-slots/<slot>/
for each of the 7 scenes (title-card, before-state, transformation,
learning-beat, recovery-arc, final-state, end-card). Per slot:
- slot.json (master frame range + fps/dimensions)
- brand.tokens.json (palette, typography, motion, honesty constraints,
HyperFrames contract reminder)
- baseline.html (current scene)
- baseline.css (full shared.css for context)
- baseline.png (mid-scene snapshot from video/snapshots/)
- acceptance.md (measurable "done" criteria + visual goals + "what can
change")
- DO_NOT_TOUCH.md (locked text, numbers, durations, scroll distances,
honesty framing, HyperFrames contract)
Plus polish-slots/README.md (operator workflow) and PROMPT.md (system
prompt to paste into claude.ai/design).
apply-polish-bundle.ts: reads polish-slots/<slot>/handoff/*.html (full
replacement) and optional handoff-shared/shared.css. Runs hyperframes
lint after, prints next steps. Does not auto-commit — review via
git diff. Uses execFileSync (no shell concat) per security hook.
.gitignore: added skills/webster-video/polish-slots/**/handoff/ and
handoff-shared/ and polish-slots.zip. Baseline slot files are committed;
polished handoff bundles stay local until reviewed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(video): free up Claude Design brief — only constraint is 150s ceiling
User pivot: rather than locking text, durations, honesty framing, and
HyperFrames-contract per scene, give Claude Design full autonomy with one
absolute constraint — total video ≤ 2.5 min (150s).
Changes:
- PROMPT.md rewritten as a free-form brief. Layout, scene count, scene
durations, copy, typography, motion language, color treatment — all
up to Claude Design.
- README.md slimmed to match.
- 14 per-slot constraint files removed (acceptance.md + DO_NOT_TOUCH.md
for each of 7 slots).
- brand.tokens.json stripped: palette + typography + motion easings +
card shadow only. No voice / design_direction / constraints / honesty
framing / HyperFrames-contract reminder text.
- slot.json field rename: master_*_seconds → current_master_*_seconds
(signals these are starting context, not specs).
- build-slot-packets.ts: added cleanStaleFiles() so future regenerations
remove acceptance.md / DO_NOT_TOUCH.md if reintroduced. Script body
shrank from 518 lines to ~200.
- polish-slots.zip regenerated (1.3 MB).
Net: -1303 / +150 lines. The bundle is now consistently constraint-free.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(video): prettier formatting on polish-bundle artifacts
No semantic changes — prettier expanded multi-line template literals in
build-slot-packets.ts and trimmed trailing whitespace in STATUS.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(video): surface demo-arc artifacts for judges; gitignore rendered mp4
Adds a "Demo arc artifacts" line under README "## For judges" pointing at the
11-week simulation council deliverables under demo-output/landing-page/, the
Managed Agents memory-store screenshot, and the local reproduce command.
Adds demo-output/videos/ to .gitignore. The rendered timelapse mp4 is hosted
externally for the hackathon submission rather than committed (avoids 70+ MB
binary in git history; the rendering pipeline + committed substrate already
proves reproducibility).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(video): silence lint — global window comment + interface defs
- video/lib/easings.js, trim-points.js: add /* global window */
(flat-config replacement for the deprecated /* eslint-env browser */).
- skills/webster-video/scripts/hydrate-demo-assets.ts: convert two
type-alias decls to interface form to satisfy
@typescript-eslint/consistent-type-definitions.
Closes the 6 lint errors blocking PR #13's CI.
* fix(video): markdown lint — ignore agent outputs, fix manual doc nits
- .markdownlint-cli2.jsonc: add demo-output/ and video/assets/ to
ignores. Both contain auto-generated visual-review.md files (same
category as history/, already ignored).
- Auto-fix MD031/MD032/MD026/MD034 in video/AGENTS.md, video/CLAUDE.md,
skills/webster-video/polish-slots/PROMPT.md.
- Add explicit language tag (`text` / `bash`) to two unfenced code
blocks in video/CLAUDE.md and skills/webster-video/polish-slots/README.md
to satisfy MD040.
Closes the 27 markdown lint errors blocking PR #13's CI.
* fix(test): bump browser-audit timeout to 30s for CI Chromium cold start
The Playwright screenshot test was timing out at bun:test's default 5s.
First Chromium launch in clean CI is 5–10s. The sibling test in
run-simulation.test.ts that exercises the same code path took 4.9s
on this run. Bumping to 30s gives margin without slowing happy-path.
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trims Webster's agent surface to a single substrate ahead of the hackathon submission. Site-sim was committed but unused by the demo video. visual-design-critic was a runtime genealogy spawn with no id.txt registration and no production orchestrator references, so removing it brings production back to a clean 9-agent set that mirrors the 9-spec LP-sim set 1:1.

Deletions: 9 site-sim agent specs, demo-sites/northwest-reno/, scripts/run-simulation-site.ts, agents/visual-design-critic.json, package.json sim:site, sim-agents.json + memory-stores.json site halves.

run-simulation.ts collapses to lp-only literal types and a single-page screenshot path. register-sim-agents.ts / sim-preflight.ts expectations drop from 18→9 sim specs, 2→1 substrate. sim-council.md fan-out drops the SUBSTRATE/AGENT_SET site case and the licensing-and-warranty-critic arm. Affected tests rewritten to LP-only assertions; no tests deleted outright.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Splits the flat agents/*.json layout into two subdirectories so the operational role of each spec is obvious from the path:
- agents/production/ — 9 specs that run Nicolette's live council
- agents/simulation/ — 9 specs that drive the LP timelapse demo

The two sets are now 1:1 symmetric. validate-agents.ts and the schema test recurse through agents/** so the strict Ajv schema gate is preserved without any weakening. Hardcoded paths in critic-genealogy.ts, planner-invoke.ts, and the affected tests are updated to the subdir layout. Genealogy spawns now write into agents/production/ so runtime-spawned critics land beside the rest of the production set.

agents/AGENTS.md updated to describe the new shape; agents/CLAUDE.md follows via the existing symlink.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Updates the judge-facing surface and the canonical north-star docs
to match the new single-substrate, 9-agent shape:
- README pitch (7→9 agents), architecture diagram now shows the
full 7-session fan-out (5 critics + monitor + visual-reviewer),
repo layout reflects agents/{production,simulation}/, submission
notes report 9 specs and 175 tests
- AGENTS.md mission says single-substrate Richer Health LP demo
and points the do-not-touch rule at agents/production/
- context/VISION.md drops the dual-substrate framing: the demo arc
is LP-only, memory store count is 6 not 12, the API cost note
scales to one substrate, and the locked/out-of-scope sections
reflect the cut
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
refactor: trim agents to production/ + simulation/, drop site-sim
Section A polish — pre-judging cleanup. Moved internal-only docs to ~/Vault/Projects/webster/internal-tracking/ (preserved locally, removed from public tree).

Untracked from context/: EXPANSION-TASKS.md, E2E-IMPLEMENTATION-TRACKER.md, SITE-FORK-CHECKLIST.md, ROADMAP.md, VIDEO-PLAN.md, VIDEO-PLAN-90s.md, v2-design.md

Untracked from prompts/ (only first/second-wbs and sim-council stay public, matching what README documents): third-wbs-session.md, fourth-wbs-session.md, sim-audit-fix-session.md, composition-session.md, e2e-demo-run-session.md, sim-runner.md

Untracked from history/: AGENTS.md (+ CLAUDE.md symlink)

Untracked .forge/ralph/ — already gitignored for new dispatches; old PRDs were still in the index. Forge workflow YAMLs and config remain tracked.

AGENTS.md updated to redirect EXPANSION-TASKS references to the vault path and drop stale doc references.
Implements the P0–P5 phase model locked in context/ONBOARDING-CASE-STUDY.md
(Q1–Q15). Skill is a thin shell over scripts/onboarding/* and the runtime
registration patterns in prompts/first-wbs-session.md, with status-file
resume and machine-checked phase exit gates.
- skills/webster-onboarding/SKILL.md — orchestration, P0 overview through
P5 first council, ! pre-load checks, hard rules on key handling
- references/{qa-bank,business-yaml-schema,key-handling,remediation,
empire-fixture}.md — detail split per skill-authoring conventions
- scripts/onboarding/verify-env.ts — reads .env.local, hits Anthropic /
GitHub / Cloudflare verify endpoints without echoing key values
- scripts/onboarding/verify-all.ts — rollup with --phase {p3,p4}, agent
count derived dynamically from agents/*.json non-sim specs
- scripts/onboarding/scaffold-repo.ts — creates GitHub repo + Astro
starter using brand identity from context/business.yaml
- package.json — wire onboarding:verify-env, :verify-all, :scaffold-repo
- context/ONBOARDING-CASE-STUDY.md — fix 9-vs-10 production agent drift
Smoke-tested verify-env (exit 1 on missing .env.local with clean hint)
and verify-all --phase p3 (per-check ✓/✗, exit 1 on real env gaps).
End-to-end skill invocation and scaffold-repo (side-effecting) untested
— follow up via Empire dry-run before recording.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- README: add a "The `wbs` alias" section under Prerequisites so judges who clone the repo can replicate the dispatcher launcher (or run the equivalent `claude --settings ...` directly without aliasing).
- Untrack `deploy/webster-dispatcher.plist`. The launchd plist hardcodes Richie's macOS user paths (`/Users/richiesakhon/...`) so it cannot be shipped publicly. Preserved in `~/Vault/Projects/webster/internal-tracking/deploy/`.
- Run `bun run format` to fix prettier table-padding in `context/ONBOARDING-CASE-STUDY.md` (PR #12 CI format-check failure).
feat(skill): webster-onboarding v2 with verify scripts
Library-shape conversion of prompts/second-wbs-session.md so operators run the weekly council pass via /webster-weekly-council. SKILL.md is a slim phase index; nine references/ files hold per-phase bash blocks loaded on demand; two skill-local scripts/ extract the reusable polling helper and the planner-JSON parser. The 662-line single-page prompt remains intact as the locked source-of-truth runbook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update README, AGENTS, context/ARCHITECTURE, context/FEATURES, and history/AGENTS so the weekly-run operator surface is the new library skill (slash-command form) with prompts/second-wbs-session.md framed as the locked single-page runbook fallback. The prompt itself is unchanged and still locked by scripts/__tests__/sim-council.test.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat: webster-weekly-council library skill (route operators to /webster-weekly-council)
Caution: Review failed. The pull request is closed.
ℹ️ Recent review info: configuration used from .coderabbit.yaml, review profile CHILL, plan Pro.
📒 Files selected for processing (13)
📝 Walkthrough

This pull request consolidates a major architectural shift for the Webster project: it removes product requirements for multiple features (apply-worker CLI, genealogy governance, orchestrator memory planner, agent specs, and demo seeding), establishes a production roster of 9 managed agents mirrored by 9 simulation agents, introduces comprehensive simulation and onboarding infrastructure, and commits a complete 11-week demo landing page with analytics, heatmaps, and visual reviews. It also updates CI tooling and eliminates task planning documentation in favor of a shipped artifact-focused state.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Operator as Operator
    participant SimLp as run-simulation-lp
    participant RunSim as runSimulation
    participant SimCouncil as sim-council.md
    participant Council as Managed<br/>Agent Council<br/>(9 agents)
    participant MemStore as Memory Stores
    participant CaptureBridge as sim-capture-bridge
    participant Browser as browser-use

    Operator->>SimLp: bun run sim:lp
    SimLp->>RunSim: runSimulation(lpConfig)
    loop for each week (0-10)
        RunSim->>RunSim: Generate synthetic analytics
        RunSim->>RunSim: Capture screenshots (Playwright)
        RunSim->>SimCouncil: runCouncil(week, config)
        rect rgba(0, 150, 200, 0.5)
            SimCouncil->>Council: Create planner session
            Council->>MemStore: Read council memory
            Council-->>SimCouncil: Plan output
        end
        rect rgba(200, 100, 0, 0.5)
            SimCouncil->>Council: Fan out monitor + 5 critics<br/>(parallel sessions)
            Council->>MemStore: Read/attach per-role memory
            Council-->>SimCouncil: Findings for each role
        end
        rect rgba(100, 150, 0, 0.5)
            SimCouncil->>Council: Create redesigner session
            Council-->>SimCouncil: Proposal + decision
        end
        rect rgba(150, 50, 150, 0.5)
            SimCouncil->>Council: Create visual-reviewer session
            Council-->>SimCouncil: Visual review markdown
        end
        RunSim->>RunSim: Write week artifacts<br/>(analytics, heatmap, review)
        alt week is 1, 5, or 10
            RunSim->>CaptureBridge: Emit CAPTURE_TRIGGER
            CaptureBridge->>Browser: Spawn capture-mem-stores
            Browser->>MemStore: Screenshot memory console
            Browser-->>CaptureBridge: Proof artifact
        end
    end
    RunSim-->>Operator: Return week HEADs + demo-output/
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
# Conflicts:
#	.gitignore
#	scripts/__tests__/browser-audit.test.ts
#	scripts/browser-audit.ts
Actionable comments posted: 1
Note: Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution: Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
context/VISION.md (1)
107-110: ⚠️ Potential issue | 🟡 Minor
Remove the remaining two-substrate execution-plan text.
These lines still talk about 18 sim agents, per-substrate invocations, and “both” simulations, which conflicts with the single-LP scope established above and reintroduces deleted scope into the canonical vision doc.
Also applies to: 154-155
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@context/VISION.md` around lines 107 - 110, Remove the leftover two-substrate execution-plan text that reintroduces deleted scope: delete or revise the phrases referencing "T2 (18 sim agent specs)", "T8 (per-substrate invocations)", and "both simulations" (and the similar lines at 154-155) so the Day 2/Day 3 plan reflects the single-LP scope; replace with a short single-LP equivalent (e.g., consolidate into a single simulation/task line) and ensure any tokens like T2/T8 are updated or removed to avoid implying multi-substrate execution.

AGENTS.md (1)
5-10: ⚠️ Potential issue | 🟡 Minor
Fix the agent-spec locations in the operator guide.
This still says the runtime critics live in `agents/*.json`, but this PR moves them into `agents/production/` and `agents/simulation/`. Leaving the old path here makes the safety boundary much easier to miss.

As per coding guidelines:
AGENTS.md: Document agent implementations and capabilities in AGENTS.md.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@AGENTS.md` around lines 5 - 10, Update the operator guide in AGENTS.md to replace the outdated runtime-critic path `agents/*.json` with the new locations `agents/production/` and `agents/simulation/`, and adjust the sentence that references "Claude Managed Agents (specs in `agents/*.json`)" to clearly point to those two directories; keep the rest of the paragraph distinguishing "Implementation operators" from "Runtime critics" and retain the cross-reference to `skills/webster-lp-audit/SKILL.md` for runtime critic guidance.
🟠 Major comments (27)
demo-output/landing-page/w07/analytics.json-14-16 (1)
14-16: ⚠️ Potential issue | 🟠 Major
Fix CTA click count mismatch between aggregate and persona totals
`cta_clicks.discovery_call` is 331 (Line 15), while persona `cta_clicks` sum to 332 (131+111+90). This makes the artifact internally inconsistent and unreliable for trend/validation tooling.

Proposed fix (pick one source of truth and align)
"cta_clicks": { - "discovery_call": 331 + "discovery_call": 332 },Based on learnings: Do not fabricate analytics numbers or business stats.
Also applies to: 17-38
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo-output/landing-page/w07/analytics.json` around lines 14 - 16, The aggregate cta_clicks value is inconsistent with the persona totals: cta_clicks.discovery_call is 331 while the sum of the persona cta_clicks (131+111+90) is 332; pick a single source of truth and make them match—either update cta_clicks.discovery_call to 332 to reflect the persona totals, or adjust the persona counts so their sum equals 331—and apply the same reconciliation to the other similar cta_clicks blocks referenced in the file (the persona cta_clicks sections).

demo-output/landing-page/w07/analytics.json-92-95 (1)
92-95: ⚠️ Potential issue | 🟠 Major
Use a consistent event metric key (`cta_clicks`)

Line 93 uses `cta_click` (singular), but the payload uses `cta_clicks` elsewhere (Line 14, Line 22/29/36). This can break downstream parsing/grouping keyed on metric names.

Proposed fix
Verify each finding against the current code and only fix it if needed. In `@demo-output/landing-page/w07/analytics.json` around lines 92 - 95, The metric key in the JSON snippet uses "cta_click" but the rest of the payload uses "cta_clicks", so update the "metric" field value from "cta_click" to "cta_clicks" (the "metric" property in the object shown with version_sha "synthetic-lp-w07-f52d97d3-ui-adjusted") to match other occurrences (lines showing "cta_clicks" at positions referenced) and ensure downstream parsers aggregate correctly; verify no other objects still use the singular form.scripts/emit-memory-screenshot-manifest.ts-38-39 (1)
38-39:⚠️ Potential issue | 🟠 MajorDo not persist absolute filesystem paths in manifest entries.
Lines 39 and 52 serialize machine-local absolute paths into versioned artifacts, which makes manifests non-portable and leaks local directory structure.
Proposed fix (store root-relative paths, keep absolute only for stat)
-import { dirname, resolve } from "node:path"; +import { dirname, join, resolve } from "node:path"; @@ - const path = resolve(dir, file); - screenshots.push({ substrate, week, path, bytes: statSync(path).size }); + const absolutePath = resolve(dir, file); + screenshots.push({ + substrate, + week, + path: join(substrate, file), + bytes: statSync(absolutePath).size, + }); @@ - const path = resolve(manualDir, file); - return { substrate: "manual", week: null, path, bytes: statSync(path).size }; + const absolutePath = resolve(manualDir, file); + return { + substrate: "manual", + week: null, + path: join("manual", file), + bytes: statSync(absolutePath).size, + };Also applies to: 51-53
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/emit-memory-screenshot-manifest.ts` around lines 38 - 39, The manifest entries currently store machine-local absolute paths (variable path built from resolve(dir, file)) which leaks local filesystem layout; change the logic that pushes into screenshots so you call statSync using the absolute resolved path but store a root-relative path string instead (e.g., compute relPath = relative(manifestRoot, path) or path.relative(dir, file) and push { substrate, week, path: relPath, bytes: statSync(path).size } into screenshots), and apply the same change for the second occurrence that serializes paths so the manifest contains relative paths while stat still uses the absolute resolved path.demo-output/landing-page/w03/analytics.json-14-16 (1)
14-16:⚠️ Potential issue | 🟠 MajorCorrect the CTA count inconsistency in this week’s analytics.
Line 15 is
331, while persona CTA clicks on Lines 22, 29, and 36 total332. Line 94 repeats331, so aggregate/event and persona data disagree.Also applies to: 17-38, 91-95
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo-output/landing-page/w03/analytics.json` around lines 14 - 16, The aggregate CTA count under "cta_clicks.discovery_call" is inconsistent with the sum of persona CTA clicks (aggregate shows 331 while persona entries total 332); update the JSON so "cta_clicks.discovery_call" (and any repeated aggregate occurrences) match the persona-level total (or adjust the persona entries if the correct total is 331), ensuring all instances of the aggregate key "cta_clicks.discovery_call" and the repeated aggregate value near the end of the file are made consistent with the persona CTA counts.demo-output/landing-page/w04/analytics.json-14-16 (1)
14-16: ⚠️ Potential issue | 🟠 Major
Reconcile CTA totals across persona and aggregate fields.
Line 15 reports `344`, but persona CTA clicks on Lines 22, 29, and 36 add up to `343`. Line 94 mirrors the aggregate `344`, so this fixture currently has conflicting counts.

Also applies to: 17-38, 91-95
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo-output/landing-page/w04/analytics.json` around lines 14 - 16, The aggregate CTA count "cta_clicks.discovery_call" (344) conflicts with the sum of persona CTA clicks (which totals 343); update the fixture so the aggregate equals the sum of the persona fields: compute the sum of the persona keys under CTA (lines 17-38) and set "cta_clicks.discovery_call" and its mirrored aggregate (lines 91-95) to that computed total (or, if the aggregate is correct, adjust the persona entry values to sum to 344) so all three places are consistent.

demo-output/landing-page/w01/analytics.json (1)
14-16: ⚠️ Potential issue | 🟠 Major
Fix CTA totals mismatch between aggregate and persona/event metrics.
Line 15 reports `314`, but persona CTA clicks on Lines 22, 29, and 36 sum to `315`. Line 94 also repeats `314`. This fixture is internally inconsistent and should be regenerated from a single source of truth.

Also applies to: 17-38, 91-95
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo-output/landing-page/w01/analytics.json` around lines 14 - 16, The CTA totals in the fixture are inconsistent: the aggregate key "cta_clicks.discovery_call" (value 314 at the top and repeated later) does not equal the sum of persona CTA clicks (which total 315). Regenerate or recalc the fixture from the single source of truth and update every instance of "cta_clicks.discovery_call" (both the aggregate entry and the repeated value near the end) so the aggregate equals the sum of the persona/event breakdowns; ensure any other related blocks in the same JSON (the ranges mentioned around the persona/event sections) are updated to the same authoritative value.

history/lp-demo/w01/analytics.json (1)
7-7: ⚠️ Potential issue | 🟠 Major
Persona session totals do not match top-level `sessions`.

Line 7 reports `sessions: 5034`, but persona sessions sum to `5035` (1863 + 1712 + 1460). Please reconcile this before merge to keep weekly artifacts internally consistent.

Also applies to: 20-21, 27-28, 34-35
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@history/lp-demo/w01/analytics.json` at line 7, The top-level "sessions" value (currently 5034) is inconsistent with the persona totals (1863 + 1712 + 1460 = 5035); update the document so the "sessions" key matches the sum of persona session counts (or adjust the persona counts if those are wrong) and apply the same reconciliation to the other weeks called out (weeks containing the similar mismatches at the other instances), ensuring the top-level "sessions" value equals the sum of the persona entries.

demo-output/landing-page/w02/heatmap.json (1)
105-117: ⚠️ Potential issue | 🟠 Major
Section identifier drift (`contact` vs `cta`) risks broken joins.

These regions are labeled as `contact`, but their metric reasons clearly reference `cta` stats. Align IDs/labels with the analytics section key to prevent mapping mismatches.

Proposed consistency fix

```diff
-      "id": "contact",
-      "label": "contact",
+      "id": "cta",
+      "label": "cta",
 ...
-      "id": "contact",
-      "label": "contact",
+      "id": "cta",
+      "label": "cta",
 ...
-      "id": "contact",
-      "label": "contact",
+      "id": "cta",
+      "label": "cta",
```

Also applies to: 205-216, 304-315
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo-output/landing-page/w02/heatmap.json` around lines 105 - 117, Some heatmap sections have mismatched identifiers: the JSON objects with "id": "contact" and "label": "contact" include "reason" fields that reference "cta" metrics (e.g., reason: "cta: views=..."), which will break analytics joins; update those objects (the ones containing "id": "contact" / "label": "contact") to consistently use the analytics key "cta" (change "id" and "label" to "cta") OR change the "reason" metric prefix to "contact" so the id/label and reason match, and apply the same fix to the other similar objects noted in the diff.

demo-output/landing-page/w05/analytics.json (1)
15-15: ⚠️ Potential issue | 🟠 Major
CTA aggregate conflicts with persona click totals.
Top-level `discovery_call` clicks are `332` (Line 15), while persona clicks total `334` (129 + 113 + 92). This should be reconciled to avoid inconsistent analytics rollups.

Also applies to: 22-23, 29-30, 36-37
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo-output/landing-page/w05/analytics.json` at line 15, The top-level "discovery_call" value (332) does not match the sum of persona-level clicks (129 + 113 + 92 = 334); update the analytics generation so the top-level "discovery_call" is computed from the persona entries (or vice versa) to keep rollups consistent: locate the "discovery_call" top-level key and the persona click objects in this JSON, recompute the aggregate as the sum of the persona click counts (or deduplicate overlapping counts if clicks are double-counted) and replace the mismatched 332 with the correct aggregated value, and ensure the same fix is applied for the other mismatched pairs referenced (lines with the same pattern).

demo-landing-page/ugly/index.html (1)
53-60: ⚠️ Potential issue | 🟠 Major
Remove or source the hard-coded business statistics.
The numeric claims in this block are presented as factual metrics without attribution. If they are synthetic placeholders, mark them clearly as synthetic or replace with non-quantified copy.
Based on learnings: Do not fabricate analytics numbers or business stats.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo-landing-page/ugly/index.html` around lines 53 - 60, The three hard-coded stats in the article block (the <strong> elements showing "3×", "67%", and "$240K") must be removed or clearly sourced: either replace those numeric values with non-quantified copy (e.g., "improved retention", "patients leave when care feels inconsistent", "lost revenue per practitioner") or add a citation/footnote and label them as synthetic/placeholders; update the three corresponding <article> elements containing those <strong> nodes so they no longer present unattributed factual metrics without sourcing.

agents/simulation/webster-lp-sim-redesigner.json (1)
5-5: ⚠️ Potential issue | 🟠 Major
Add a dedicated `# Scope` block to the redesigner system prompt.

Line 5 defines reads/tasks/outputs, but it does not explicitly declare exact ownership boundaries versus other agents in a Scope section.
As per coding guidelines: "Include a scope section in system prompts that EXACTLY states what this agent is responsible for, with no overlap with other agents."
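For illustration, a minimal sketch of what such a block could look like inside the redesigner's system prompt; the wording, file names, and exclusions here are hypothetical and would need to match the agent's actual reads and outputs:

```markdown
# Scope
You are the ONLY agent that produces the weekly redesign proposal.
- In scope: read the listed context files via GitHub MCP; write
  proposal.md and decision.json.
- Out of scope: git operations, external fetches, and any judging
  duties owned by the critics or the monitor. No other agent writes
  these files, and you write no others.
```

The same pattern applies to the planner and visual-reviewer prompts flagged below in this review.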
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agents/simulation/webster-lp-sim-redesigner.json` at line 5, The system prompt in the "system" string for Webster's lp simulation redesigner is missing an explicit "# Scope" block that defines exact ownership boundaries; add a dedicated "# Scope" section near the top of that "system" string (inside agents/simulation/webster-lp-sim-redesigner.json) that enumerates this agent's sole responsibilities (e.g., reads: specific files listed, task: produce proposal.md and decision.json, outputs: push/create files via GitHub MCP) and explicitly states what other agents or systems must NOT do (no git, no external fetches, no overlapping judging duties with critics/monitor), ensuring wording matches the existing instructions and avoids overlap with other agents.

agents/simulation/webster-lp-sim-planner.json (1)
5-5: ⚠️ Potential issue | 🟠 Major
Add an explicit `# Scope` section to the planner system prompt.

Line 5 has Bootstrap/Task/Output, but no dedicated Scope block that defines planner-only ownership and explicit out-of-scope boundaries.
As per coding guidelines: "Include a scope section in system prompts that EXACTLY states what this agent is responsible for, with no overlap with other agents."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agents/simulation/webster-lp-sim-planner.json` at line 5, Add a dedicated "# Scope" section to the "system" value in webster-lp-sim-planner.json that explicitly enumerates this agent's responsibilities (e.g., choosing weekly experiment direction, reading specified files via GitHub MCP, producing the plan.md JSON with fields classification/next_action/direction_hint/optional new_critic_request/rationale) and lists clear out-of-scope boundaries (e.g., no git operations, no file IO outside GitHub MCP, no publishing changes, no critic decision-making, and no external network fetches); place it near the top of the prompt (alongside Bootstrap/Task/Output) so it is plainly visible and refers to the same planner role and the required reads (demo-landing-page/context/* and context/sim/lp/planner/notes.md) to avoid overlap with other agents.

scripts/onboarding/verify-env.ts (1)
57-100: ⚠️ Potential issue | 🟠 Major
Add request timeouts to provider verification calls.
The fetch calls on lines 57, 77, and 98 lack timeouts, allowing the onboarding process to hang indefinitely if network requests stall.
Proposed fix
```diff
+const VERIFY_TIMEOUT_MS = 10_000;
+
+function fetchWithTimeout(url: string, init: RequestInit): Promise<Response> {
+  return fetch(url, { ...init, signal: AbortSignal.timeout(VERIFY_TIMEOUT_MS) });
+}
+
 async function verifyAnthropic(key: string): Promise<VerifyResult> {
@@
-  const res = await fetch("https://api.anthropic.com/v1/models", {
+  const res = await fetchWithTimeout("https://api.anthropic.com/v1/models", {
     headers: {
       "x-api-key": key,
       "anthropic-version": "2023-06-01",
     },
   });
@@
 async function verifyGitHub(token: string): Promise<VerifyResult> {
@@
-  const res = await fetch("https://api.github.com/user", {
+  const res = await fetchWithTimeout("https://api.github.com/user", {
     headers: {
       Authorization: `Bearer ${token}`,
       Accept: "application/vnd.github+json",
       "X-GitHub-Api-Version": "2022-11-28",
     },
   });
@@
 async function verifyCloudflare(token: string): Promise<VerifyResult> {
@@
-  const res = await fetch("https://api.cloudflare.com/client/v4/user/tokens/verify", {
+  const res = await fetchWithTimeout("https://api.cloudflare.com/client/v4/user/tokens/verify", {
     headers: { Authorization: `Bearer ${token}` },
   });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/onboarding/verify-env.ts` around lines 57 - 100, Add a per-request timeout using AbortController for the three provider verification fetches (the Anthropic fetch block in verifyAnthropic, verifyGitHub, and verifyCloudflare): create an AbortController, start a setTimeout to call controller.abort() after a reasonable timeout (e.g., 5000ms), pass controller.signal to fetch, and ensure you clear the timeout (clearTimeout) in a finally block so it doesn’t leak; preserve existing error handling but ensure aborted requests surface as fetch errors so the existing catch returns a failed VerifyResult.

demo-landing-page/ugly/style.css (1)
86-87: ⚠️ Potential issue | 🟠 Major
Avoid hotlinking hero media from an external URL.
Loading the hero image from Unsplash makes rendering dependent on third-party availability and network policy, which can cause flaky captures and nondeterministic output.
Proposed fix
```diff
-    url("https://images.unsplash.com/photo-1624616802182-57737fa83971?w=1920&h=1080&fit=crop&q=80&auto=format")
-    center/cover;
+    url("./assets/hero-background.webp") center/cover;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo-landing-page/ugly/style.css` around lines 86 - 87, The CSS is hotlinking an external Unsplash URL (url("https://images.unsplash.com/...")) for the hero background which causes flaky renders; download and check the image into the repo (e.g., demo-landing-page/assets/ or static/), update the style.css rule that currently uses url("https://images.unsplash.com/...") to reference a relative local path (e.g., url("/assets/hero.jpg")) and ensure your build/static assets pipeline serves that file so the hero background uses the bundled asset instead of the external URL.

agents/simulation/webster-lp-sim-visual-reviewer.json (1)
5-5: ⚠️ Potential issue | 🟠 Major
Add an explicit `# Scope` section to the system prompt.

This prompt has bootstrap/task/output, but it does not include a dedicated scope block that explicitly bounds responsibilities and overlap, which is required for agent-spec compliance.
As per coding guidelines: "Include a scope section in system prompts that EXACTLY states what this agent is responsible for, with no overlap with other agents."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agents/simulation/webster-lp-sim-visual-reviewer.json` at line 5, The system prompt string in webster-lp-sim-visual-reviewer.json is missing a dedicated "# Scope" section; update the "system" value (the long prompt text) to add a clear "# Scope" block that explicitly and concisely states only this agent’s responsibilities (visual review of LP screenshots, required GitHub MCP reads, judging against brand/personas, and writing the visual-review.md) and explicitly excludes any other agents’ duties (e.g., editing content, running git, or making network requests); place the block near the top of the existing prompt so functions like the bootstrap/task/output sections remain, and ensure the language matches the agent-spec requirement of EXACTLY stating responsibilities with no overlap.

scripts/onboarding/scaffold-repo.ts (1)
246-258: ⚠️ Potential issue | 🟠 Major
The fixed temp worktree breaks reruns after a partial scaffold.

`tmp/onboarding-scaffold/<repoName>` is reused across runs, so any leftover `.git` metadata from a prior attempt makes `git remote add origin` and later commands fail. That means the advertised idempotent path is not actually repeatable for the same repo name.

Also applies to: 307-312
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/onboarding/scaffold-repo.ts` around lines 246 - 258, The commitAndPush function fails on reruns because leftover .git metadata in the reused tmp/onboarding-scaffold/<repoName> directory prevents git remote add/push; before running git commands, detect and remove any existing .git directory (e.g. using fs.existsSync(path.join(workDir, ".git")) and fs.rmSync(..., { recursive: true, force: true })) so git init can run cleanly, then proceed with run(["git", "init", ...]) etc.; apply the same pre-check-and-remove logic to the other scaffold git block referenced around the later occurrence (the block at 307-312) so repeated runs are idempotent.

scripts/register-sim-agents.ts (1)
38-42: ⚠️ Potential issue | 🟠 Major
Validate the full sim-spec set before the first API write.

Right now filename filtering is the only preflight. If one JSON has a bad `name` or other malformed shape, this loop can register some agents successfully and only fail afterward while deriving the manifest, leaving a partial remote state behind. Parse/validate the entire set up front, then start the POSTs.

Based on learnings: Before committing a new agent spec: validate against schema, ensure registration is idempotent, and check name collision across all specs in both agents sets.
Also applies to: 89-98
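A minimal sketch of the parse-then-validate-then-write shape being asked for; the spec fields, helper name, and error wording here are illustrative, not the project's real schema:

```typescript
// Illustrative spec shape; the real schema has more required fields.
type AgentSpec = { name: string; system: string };

// Parse and validate every spec before any network write, so one bad
// file aborts the whole run instead of leaving a partial remote state.
function preflightSpecs(raw: { file: string; json: string }[]): AgentSpec[] {
  const errors: string[] = [];
  const specs: AgentSpec[] = [];
  const seen = new Set<string>();
  for (const { file, json } of raw) {
    let parsed: Partial<AgentSpec>;
    try {
      parsed = JSON.parse(json);
    } catch {
      errors.push(`${file}: invalid JSON`);
      continue;
    }
    if (typeof parsed.name !== "string" || parsed.name.length === 0) {
      errors.push(`${file}: missing or empty name`);
      continue;
    }
    if (seen.has(parsed.name)) {
      errors.push(`${file}: duplicate name ${parsed.name}`);
    }
    seen.add(parsed.name);
    specs.push(parsed as AgentSpec);
  }
  if (errors.length > 0) {
    throw new Error(errors.join("; "));
  }
  return specs; // only now is it safe to start the POSTs
}
```

In the real script the per-field checks would be replaced by the existing schema validation, and the returned set fed to the idempotent registration loop.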
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/register-sim-agents.ts` around lines 38 - 42, The loadSimAgentSpecs function currently streams JSON files into POSTs without full preflight validation; update loadSimAgentSpecs to read and parse all candidate files first, validate each parsed object against the AgentSpec schema (ensure required fields like name, id, etc.), accumulate and fail fast on any validation errors, detect duplicate names within the parsed set, then perform idempotent registration POSTs (use a safe upsert or check-exists before create) and check for name collisions against the existing remote agents before any network writes; also apply the same preflight validation logic to the registration loop referenced around the register/POST logic (the code block around lines ~89-98) so the manifest derivation and remote writes only run after the complete validated set is confirmed.

scripts/onboarding/scaffold-repo.ts (1)
196-199: ⚠️ Potential issue | 🟠 Major
Update scaffold dependencies to Astro 6.

This generator creates projects with `astro@^4.0.0` and `@astrojs/cloudflare@^9.0.0`, violating the locked stack (Astro 6). Update to `astro@^6.0.0` and `@astrojs/cloudflare@^13.0.0` to align with project standards.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/onboarding/scaffold-repo.ts` around lines 196 - 199, Update the hardcoded dependency versions in the dependencies object: change "astro" from "^4.0.0" to "^6.0.0" and change "@astrojs/cloudflare" from "^9.0.0" to "^13.0.0" so the scaffolded projects use Astro 6; locate the dependencies object (the dependencies symbol in scaffold-repo.ts) and replace the version strings for the astro and `@astrojs/cloudflare` entries accordingly.

context/ONBOARDING-CASE-STUDY.md (1)
25-27: ⚠️ Potential issue | 🟠 Major
Update this spec to the current production agent inventory.

This doc still says onboarding provisions/verifies 10 production agents via `agents/*.json`, but the repo contract in this PR is 9 protected production specs under `agents/production/`. If someone follows this literally, the verify step and the narration both drift from the actual production set.

Based on learnings: The 9 specs in `agents/production/` are the production set for Nicolette's real weekly council; do not modify them.

Also applies to: 54-55, 130-137, 217-223
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@context/ONBOARDING-CASE-STUDY.md` around lines 25 - 27, The spec text incorrectly claims onboarding provisions/verifies 10 production agents via agents/*.json; update every referenced count to reflect the actual 9 protected production specs present under agents/production/ (e.g., change "10 production agents" → "9 production agents" in the Q3–Q5 table and the other occurrences you noted around lines 54–55, 130–137, and 217–223), keep the repository's 9 JSON specs untouched (do not modify agents/production/*), and ensure any narration or verification steps (mentions of agents/*.json, "production set", or verify step) consistently reference the 9-agent inventory.

demo-output/landing-page/w04/visual-review.md (1)
6-13: ⚠️ Potential issue | 🟠 Major
Keep this bundle on the `week-NN` contract.

The new demo-manifest pipeline only scans directories named `week-\d{2}`. A `demo-output/landing-page/w04/...` artifact will be invisible to `build-demo-manifest.ts`, so this review bundle will be skipped from the manifest/final-sheet flow.

Based on learnings: Structure new hackathon simulation output as `demo-output/landing-page/week-NN/...` following the asset-bundle contract defined in `context/EXPANSION-TASKS.md` T7–T9.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo-output/landing-page/w04/visual-review.md` around lines 6 - 13, The bundle is placed at demo-output/landing-page/w04 which doesn't match the pipeline's week-\d{2} scan; move or rename the artifact directory to follow the asset-bundle contract so it becomes demo-output/landing-page/week-04/... (or create a week-04 symlink) and ensure any generated references (manifest entries) point to the new path; if you prefer changing code instead, update the scanner in build-demo-manifest.ts to accept the current pattern, but the preferred fix is to restructure output to demo-output/landing-page/week-NN per context/EXPANSION-TASKS.md T7–T9 so the bundle is discovered by build-demo-manifest.ts.

scripts/build-demo-manifest.ts (1)
240-246: ⚠️ Potential issue | 🟠 Major
Validate `final_sheet` with the other top-level absolute paths.

`DEMO_MANIFEST_SCHEMA` requires `final_sheet` to be absolute, but `validateDemoManifest()` never checks it. A caller can pass a relative `final_sheet`, get a clean validation result, and only fail later when a consumer treats the manifest as schema-valid.

Proposed fix
```diff
   if (
     !manifest.substrate ||
     !isAbsolute(manifest.output_dir) ||
-    !isAbsolute(manifest.manifest_path)
+    !isAbsolute(manifest.manifest_path) ||
+    !isAbsolute(manifest.final_sheet)
   ) {
     throw new Error("manifest paths must be absolute and substrate must be present");
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/build-demo-manifest.ts` around lines 240 - 246, The validation block that currently checks manifest.substrate, manifest.output_dir, and manifest.manifest_path must also validate manifest.final_sheet is present and absolute; update the condition (in validateDemoManifest / the manifest validation block that throws "manifest paths must be absolute and substrate must be present") to include !isAbsolute(manifest.final_sheet) (and check presence if needed) so DEMO_MANIFEST_SCHEMA's requirement for an absolute final_sheet is enforced at validation time.

scripts/__tests__/register-sim-agents.test.ts (1)
34-35: ⚠️ Potential issue | 🟠 Major
Use `addFormats(ajv)` instead of `addFormats.default(ajv)`.

The import `import addFormats from "ajv-formats"` provides the default export directly. Calling `.default` on it bypasses the intended API and relies on module-interop quirks, which can cause runtime errors.

Proposed fix
```diff
-  addFormats.default(ajv);
+  addFormats(ajv);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/__tests__/register-sim-agents.test.ts` around lines 34 - 35, The test creates an Ajv2020 instance as ajv and currently calls addFormats.default(ajv), which relies on interop quirks; change the call to use the default export directly by invoking addFormats(ajv) instead. Locate the Ajv2020 instantiation (const ajv = new Ajv2020(...)) and replace the addFormats.default usage with addFormats(ajv) so the addFormats import is used via its intended API.

scripts/__tests__/register-sim-agents.test.ts (1)
76-151: ⚠️ Potential issue | 🟠 Major
Add assertions for the critical API contract: beta header and pagination.
The tests prove create-vs-reuse behavior but have two gaps that would allow regressions:
- Header validation missing: No test asserts that `anthropic-beta: managed-agents-2026-04-01` is sent with requests; removing this required header would not fail the tests.
- Pagination coverage missing: Both test mocks return `has_more: false`, so if `findAgentByName()` stops handling `next_page` or `last_id` cursors, the tests would still pass. Add a test case that mocks `has_more: true` with `next_page` to exercise the loop in `findAgentByName()`.
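The cursor loop behind the pagination gap can be exercised with a two-page mock. A self-contained sketch of the pattern, where `findByName`, `fetchPage`, and the page shape are stand-ins rather than the real helpers in the test file:

```typescript
type Agent = { id: string; name: string };
type Page = { data: Agent[]; has_more: boolean; next_page?: string };

// Cursor loop of the kind the test should exercise: keep fetching pages
// via next_page until the target name is found or has_more is false.
function findByName(
  name: string,
  fetchPage: (cursor?: string) => Page,
): Agent | undefined {
  let cursor: string | undefined;
  for (;;) {
    const page = fetchPage(cursor);
    const hit = page.data.find((a) => a.name === name);
    if (hit) return hit;
    if (!page.has_more || page.next_page === undefined) return undefined;
    cursor = page.next_page;
  }
}
```

In the test, asserting that the mock was called once per page (and that an agent on the second page is found) is what catches a regression to single-page behavior.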
Verify each finding against the current code and only fix it if needed. In `@scripts/__tests__/register-sim-agents.test.ts` around lines 76 - 151, The tests in register-sim-agents.test.ts miss asserting the required API header and pagination behavior; update the existing fetch mocks (used when testing registerSimAgents and findAgentByName) to assert that requests include the header "anthropic-beta: managed-agents-2026-04-01" (verify init.headers or RequestInit passed into globalThis.fetch) and add a new test (or extend an existing one) that simulates paginated responses by returning { data: [...], has_more: true, next_page: "cursor1" } on the first GET and a final page with has_more: false on the second GET so findAgentByName's loop is exercised; ensure the POST path still checks for header when creating agents and keep using loadSimAgentSpecs(), registerSimAgents, and the same globalThis.fetch override pattern to locate the logic.

scripts/run-simulation.ts (1)
181-198: ⚠️ Potential issue | 🟠 Major
Fail the run when a memory-summary write is rejected.
These POSTs ignore `res.ok`. A 401/404/5xx leaves the simulation green while no summary document is persisted, which breaks the state this step is supposed to carry across weeks.

Based on learnings: Do not silently catch errors to make things look green; surface the `[STUCK]` prefix if a path is not clear.

Proposed fix
```diff
-    await fetch(`${API}/memory_stores/${storeId}/documents`, {
+    const res = await fetch(`${API}/memory_stores/${storeId}/documents`, {
       method: "POST",
       headers: {
         "x-api-key": apiKey,
         "anthropic-version": VERSION,
         "anthropic-beta": BETA,
@@
         },
       }),
     });
+    if (!res.ok) {
+      throw new Error(
+        `memory summary write failed for ${role} (${res.status}): ${await res.text()}`,
+      );
+    }
   }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/run-simulation.ts` around lines 181 - 198, The POST to `${API}/memory_stores/${storeId}/documents` currently ignores the HTTP response; change the call in scripts/run-simulation.ts so you await the fetch result into a variable (the fetch that posts summaries[role]) and check res.ok — if not ok, read the response text/JSON and throw an Error that includes the status and body so the run fails; ensure the thrown error or logged message includes the “[STUCK]” prefix when the write path is unclear (so the simulation surface is red/failed rather than silently passing).

scripts/run-simulation.ts (1)
91-96: ⚠️ Potential issue | 🟠 Major
Do not forward `ANTHROPIC_API_KEY` into `prompts/sim-council.md`.

The spawned env inherits the parent `ANTHROPIC_API_KEY`, but the prompt aborts when that variable is exported. In the default path, enabling memory-summary writes via env breaks the council step immediately.

Based on learnings: Applies to `prompts/sim-council.md`: Use `prompts/sim-council.md` as the simulation orchestrator (a fork of the production orchestrator) for hackathon expansion.

Proposed fix
```diff
 function defaultRunCouncil(
   env: Record<string, string>,
   command = "bun scripts/run-markdown-bash.ts prompts/sim-council.md",
 ): void {
-  execFileSync("bash", ["-lc", command], { env: { ...process.env, ...env }, stdio: "inherit" });
+  const childEnv = { ...process.env, ...env };
+  delete childEnv.ANTHROPIC_API_KEY;
+  execFileSync("bash", ["-lc", command], { env: childEnv, stdio: "inherit" });
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/run-simulation.ts` around lines 91 - 96, defaultRunCouncil currently forwards the parent ANTHROPIC_API_KEY into the child process, which causes prompts/sim-council.md to abort; fix by creating a child environment object (merge process.env and the passed env) and explicitly remove/undefine the ANTHROPIC_API_KEY key before calling execFileSync; update the defaultRunCouncil implementation (referencing function defaultRunCouncil, execFileSync, command, and prompts/sim-council.md) to pass that sanitized env to the child process.

scripts/context-schema.ts (1)
153-161: ⚠️ Potential issue | 🟠 Major
Convert file-read / JSON-parse failures into validation errors.

`validateContextDirectory()` throws on `ENOENT` or malformed `personas.json`/`brand.json`, so the CLI exits before printing the per-directory report. Return those as collected errors instead of crashing the whole validation pass.

Proposed fix
```diff
 export function validateContextDirectory(contextDir: string): string[] {
-  const business = readFileSync(`${contextDir}/business.md`, "utf8");
-  const personas = JSON.parse(readFileSync(`${contextDir}/personas.json`, "utf8"));
-  const brand = JSON.parse(readFileSync(`${contextDir}/brand.json`, "utf8"));
-  return [
-    ...validateBusinessMarkdown(business),
-    ...validatePersonas(personas),
-    ...validateBrandContext(brand),
-  ];
+  const errors: string[] = [];
+  let business: string | undefined;
+  let personas: unknown;
+  let brand: unknown;
+
+  try {
+    business = readFileSync(`${contextDir}/business.md`, "utf8");
+  } catch (error) {
+    errors.push(`business.md could not be read: ${(error as Error).message}`);
+  }
+  try {
+    personas = JSON.parse(readFileSync(`${contextDir}/personas.json`, "utf8"));
+  } catch (error) {
+    errors.push(`personas.json is missing or invalid JSON: ${(error as Error).message}`);
+  }
+  try {
+    brand = JSON.parse(readFileSync(`${contextDir}/brand.json`, "utf8"));
+  } catch (error) {
+    errors.push(`brand.json is missing or invalid JSON: ${(error as Error).message}`);
+  }
+
+  return [
+    ...errors,
+    ...(business ? validateBusinessMarkdown(business) : []),
+    ...(personas === undefined ? [] : validatePersonas(personas)),
+    ...(brand === undefined ? [] : validateBrandContext(brand)),
+  ];
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/context-schema.ts` around lines 153 - 161, The function validateContextDirectory currently throws on missing or malformed files; instead wrap the reads and JSON.parse calls for business.md, personas.json, and brand.json inside try/catch blocks in validateContextDirectory so any ENOENT or parse errors are caught and converted into returned validation error messages (e.g., push human-readable strings into the returned array). Keep calling validateBusinessMarkdown(business), validatePersonas(personas), and validateBrandContext(brand) when a file successfully loads, but if a read/parse fails for a given file, add a clear error entry to the result array describing which file failed and the error message rather than letting the exception propagate.

scripts/capture-mem-stores.ts (1)
61-84: ⚠️ Potential issue | 🟠 Major
Validate `console_url` and `output` before driving a real browser profile.

This accepts any URL and any destination path. Because the script opens that URL in a logged-in default browser profile and writes to the requested file, a forged `CAPTURE_TRIGGER` can browse arbitrary sites and overwrite arbitrary files. Please restrict `console_url` to the Anthropic memory-stores page and `output` to the expected screenshot directory.

Proposed fix

```diff
 function parsePayload(raw: string): CaptureTriggerPayload {
   const parsed = JSON.parse(raw) as Partial<CaptureTriggerPayload>;
@@
   if (!parsed.output) {
     throw new Error("payload.output is required");
   }
   if (!parsed.console_url) {
     throw new Error("payload.console_url is required");
   }
+  const url = new URL(parsed.console_url);
+  if (
+    url.origin !== "https://console.anthropic.com" ||
+    !url.pathname.startsWith("/settings/memory-stores")
+  ) {
+    throw new Error("payload.console_url must target Anthropic Console memory stores");
+  }
+  const normalizedOutput = parsed.output.replaceAll("\\", "/");
+  if (!normalizedOutput.startsWith("assets/memory-stores-screenshots/")) {
+    throw new Error("payload.output must stay under assets/memory-stores-screenshots/");
+  }
   return {
     event: parsed.event,
     substrate: parsed.substrate,
     week: parsed.week,
-    output: parsed.output,
-    console_url: parsed.console_url,
+    output: normalizedOutput,
+    console_url: parsed.console_url,
   };
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/capture-mem-stores.ts` around lines 61 - 84, The parsePayload function currently accepts any console_url and output; validate that parsed.console_url exactly matches (or matches a strict regexp for) the Anthropic memory-stores page URL used by the app (e.g., host and path for the memory-stores console) and throw an error if it does not, and validate parsed.output to ensure it is inside the allowed screenshots directory (no absolute paths, no .. segments, and only allow safe filenames/extensions such as alphanumeric + dashes/underscores with .png/.jpg); update parsePayload to perform these checks and return only when both validations pass, otherwise throw descriptive errors.
🧹 Nitpick comments (4)
scripts/__tests__/validate-agents.test.ts (1)
39-42: Make `seo-critic` fixture selection deterministic.

Basename lookup can become ambiguous if another `seo-critic.json` is added later (e.g., archive/mirror). Prefer targeting the canonical production path directly.

Proposed change
```diff
-  const seoCriticPath = agentFiles.find((f) => f.endsWith("seo-critic.json"));
-  if (!seoCriticPath) {
-    throw new Error("seo-critic.json not found under agents/");
-  }
+  const seoCriticPath = join(agentsDir, "production", "seo-critic.json");
+  if (!agentFiles.includes(seoCriticPath)) {
+    throw new Error("agents/production/seo-critic.json not found under agents/");
+  }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/__tests__/validate-agents.test.ts` around lines 39 - 42, The current selection uses agentFiles.find((f) => f.endsWith("seo-critic.json")) which is ambiguous if multiple files share that basename; change the lookup to target the canonical production path explicitly (for example look for the full relative path that includes the agents/seo-critic directory) so seoCriticPath is found deterministically instead of by basename.

demo-landing-page/ugly/style.css (1)
18-20: Respect reduced-motion preferences for smooth scrolling.

Add a reduced-motion override so users who disable motion aren't forced into animated scroll behavior.
Proposed fix
```diff
 html {
   scroll-behavior: smooth;
 }
+
+@media (prefers-reduced-motion: reduce) {
+  html {
+    scroll-behavior: auto;
+  }
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo-landing-page/ugly/style.css` around lines 18 - 20, The current CSS rule forces smooth scrolling on the html element via the scroll-behavior property; add a prefers-reduced-motion media-query override so users who opt out of motion get non-animated scrolling (override html { scroll-behavior: smooth; } with html { scroll-behavior: auto; } inside a `@media` (prefers-reduced-motion: reduce) block).

scripts/__tests__/sim-council.test.ts (1)
74-82: Avoid VCS-state assertions in unit tests.

Lines 74–82 make the suite depend on git working-tree state, not sim-council behavior. That is brittle in non-repo test environments and will fail on legitimate future edits to `prompts/second-wbs-session.md`.

Proposed change
```diff
-  test("does not modify production weekly orchestrator", () => {
-    const diff = Bun.spawnSync(
-      ["git", "diff", "--name-only", "--", "prompts/second-wbs-session.md"],
-      {
-        stdout: "pipe",
-      },
-    );
-    expect(new TextDecoder().decode(diff.stdout).trim()).toBe("");
-  });
```

Move this guard to CI policy (PR-level check), and keep this file focused on `prompts/sim-council.md` behavior/invariants.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/__tests__/sim-council.test.ts` around lines 74 - 82, The test "does not modify production weekly orchestrator" in scripts/__tests__/sim-council.test.ts relies on VCS state via Bun.spawnSync(["git","diff",...]) and should be removed from the unit test suite; delete the block that shells out to git (the test starting with test("does not modify production weekly orchestrator", ...)) and instead enforce the guard as a CI/PR-level check (move the git-diff assertion into your CI pipeline or a separate pre-merge script), keeping this test file focused only on sim-council behavior and invariants (e.g., tests around prompts/sim-council.md).

demo-output/landing-page/agents.json (1)
1-1: Reduce tool privileges for `local-lp-redesigner`.

`Bash` enables arbitrary shell execution, while this role's stated workflow is primarily file reads/writes/edits. Dropping `Bash` (unless there is a hard requirement) lowers prompt-injection and accidental-command risk.
Verify each finding against the current code and only fix it if needed. In `@demo-output/landing-page/agents.json` at line 1, The agent spec for local-lp-redesigner grants an unnecessary high-risk tool ("Bash"); remove "Bash" from the tools array in the local-lp-redesigner agent definition (the object keyed "local-lp-redesigner") so it only lists safe operations ("Read","Write","Edit","MultiEdit") unless there's an explicit, documented need for shell access; after editing the tools array, run any schema/validation checks or CI that validate agents.json to ensure the file stays well-formed and that local-lp-redesigner still meets workflow requirements.
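Applied to the shape the prompt describes, the trimmed agent entry might look like the fragment below. The surrounding structure is illustrative; the real `agents.json` schema governs:

```json
{
  "local-lp-redesigner": {
    "tools": ["Read", "Write", "Edit", "MultiEdit"]
  }
}
```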
```typescript
export async function processBridgeInput(input: string, deps: BridgeDeps = {}): Promise<void> {
  const spawnCapture = deps.spawnCapture ?? defaultSpawnCapture;
  const writeStdout = deps.writeStdout ?? ((line: string) => process.stdout.write(line));
  const lines = input.split(/(?<=\n)/).filter((line) => line.length > 0);
  for (const rawLine of lines) {
    writeStdout(rawLine);
    const line = rawLine.endsWith("\n") ? rawLine.slice(0, -1) : rawLine;
    const trigger = parseCaptureTriggerLine(line);
    if (trigger) {
      await spawnCapture(trigger);
    }
  }
}

async function readStdin(): Promise<string> {
  const chunks: Uint8Array[] = [];
  for await (const chunk of Bun.stdin.stream()) {
    chunks.push(chunk);
  }
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const merged = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    merged.set(chunk, offset);
    offset += chunk.length;
  }
  return new TextDecoder().decode(merged);
}

if (import.meta.main) {
  try {
    await processBridgeInput(await readStdin());
```
Process stdin incrementally instead of buffering to EOF.
This bridge doesn't forward lines or fire captures until readStdin() finishes, so a week-1 or week-5 trigger will be handled after the whole simulation exits. That defeats the point of a mid-run snapshot and can capture the final store state three times instead of the state at each trigger.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/sim-capture-bridge.ts` around lines 50 - 81, The code currently
buffers the entire stdin via readStdin() and only calls processBridgeInput after
EOF; change the import.meta.main path to consume Bun.stdin.stream()
incrementally and handle lines as they arrive: inside the main async block read
chunks from Bun.stdin.stream(), append to a string buffer, split out complete
newline-terminated lines, and for each complete line call the same logic used in
processBridgeInput (writeStdout/defaultWrite and parseCaptureTriggerLine + await
spawnCapture or defaultSpawnCapture) so triggers fire mid-run; keep any leftover
partial line in the buffer until more data arrives and avoid awaiting EOF.
Ensure you reuse parseCaptureTriggerLine, spawnCapture (or defaultSpawnCapture)
and writeStdout/defaults to preserve behavior.
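The incremental rework the prompt describes can be sketched as a line-buffered async generator. This is an illustrative rewrite, not the repo's actual code:

```typescript
// Sketch: consume a byte stream incrementally, yielding complete
// newline-terminated lines as they arrive and buffering any partial tail,
// so a CAPTURE_TRIGGER line can be handled mid-run instead of after EOF.
export async function* splitLines(
  stream: AsyncIterable<Uint8Array>,
): AsyncGenerator<string> {
  const decoder = new TextDecoder();
  let buffer = "";
  for await (const chunk of stream) {
    buffer += decoder.decode(chunk, { stream: true });
    let newlineIndex: number;
    while ((newlineIndex = buffer.indexOf("\n")) !== -1) {
      yield buffer.slice(0, newlineIndex + 1); // keep the trailing "\n"
      buffer = buffer.slice(newlineIndex + 1);
    }
  }
  buffer += decoder.decode(); // flush any incomplete multi-byte sequence
  if (buffer.length > 0) yield buffer; // final partial line at EOF
}
```

The per-line body of `processBridgeInput` (writeStdout, `parseCaptureTriggerLine`, `spawnCapture`) would then run inside `for await (const line of splitLines(Bun.stdin.stream()))`, firing captures as triggers appear rather than after the simulation exits.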
* chore(final-polish): remove video footprint + polish context docs

Remove submission-tooling that does not belong in the public repo: skills/webster-video/, video/, context/webster-video/, the video composition prompt, the onboarding case-study, and the webster-onboarding empire fixture. The HyperFrames render pipeline that produced the timelapse was always tooling-side, not product.

Polish ARCHITECTURE / FEATURES / DOMAIN-MODEL / VISION / QUALITY-GATES to the post-eeda2bc shipped state: 9 production Managed Agents mirrored 1:1 by 9 simulation specs. Reframe Layer 6 (video) as external submission tooling, not a blocked product layer. Move visual-design-critic to its true provenance — a W4 genealogy spawn, not a permanent L2 base agent. Drop deferred / hackathon-crunch language from the canon docs; that posture is over.

Strip unverifiable projections from README (cost-per-month, cost-per-run, agency-pricing comparisons). Standardize the test-count phrasing to "29 test files green via bun run validate". Tidy dangling references in skills/webster-onboarding and qa-bank.

Validate is green: 176 tests pass, 0 lint warnings, 0 type errors, 0 markdown errors, 19 JSON specs valid, 7 findings files valid.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(final-polish): drop watch-dispatches.sh dead Forge watcher

Forge dispatch watcher with no live consumer post-hackathon. Only references were in checkpoints/compactions; no callers in scripts/, prompts/, package.json, or workflows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(final-polish): add judge tour + per-week INDEX

Add a "5-minute judge tour" section to README naming the exact click-path for evaluation: pitch → INDEX → one week's visual review → critic-genealogy.ts → optional live --dry-run. Add demo-output/landing-page/INDEX.md narrating the 11-week LP timelapse, one beat per week with classification + links to that week's screenshots and visual review. Turns scattered week directories into a scannable evidence path.

Drop the "rendered MP4 hosted externally — link in submission form" references from README, ARCHITECTURE, and FEATURES. The video link is owned by the submission form, not the repo, and pointing at a form judges may not have open is dead weight. The per-week assets plus INDEX.md carry the visual evidence on their own.

Validate green: 176 tests pass, 0 lint warnings, 0 type errors, 0 markdown errors, 19 specs valid, 7 findings files valid.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Single rollup PR for the Built-with-Opus-4.7 hackathon submission. 42 commits' worth of expansion work going from `dev` → `main` as one atomic block (per AGENTS.md: "the expansion lands atomically so production stays coherent").

What's in this rollup
Today's 5 PRs (all merged to `dev` already):

- `skills/webster-video/` HyperFrames timelapse render pipeline + `demo-output/` 11-week LP simulation assets (166 files, all adds)
- `agents/` reorganization: `production/` (9 specs) + `simulation/` (9 specs, 1:1 mirror), site-sim substrate dropped, README/AGENTS/VISION aligned to single-substrate Richer Health LP demo
- `skills/webster-onboarding/` v2 with verify scripts
- repo polish (untracked internal tracking docs + stale Ralph PRDs + personal launchd plist), `wbs` alias setup section in README
- `skills/webster-weekly-council/` library skill that routes operators to `/webster-weekly-council` as the slash-command equivalent of the locked `prompts/second-wbs-session.md` runbook

Earlier dev work (PRs #1–#8, already merged to dev): genealogy governance layers, Pair Alpha secondary substrate seeding, planner agent spec, apply-worker CLI, memory substrate, simulation council runner, demo arc seeders.
Production safety check

- `agents/production/` (planner + 5 critics + visual-reviewer + monitor + redesigner) are untouched in their behavior — only their on-disk path moved.
- `prompts/second-wbs-session.md` is unchanged. Nicolette's live weekly council still runs the same orchestration.
- `agents/webster-visual-design-critic.json` was orphaned (dropped from production) as a deliberate symmetry call with the LP-sim cluster. Not registered against the live API for the production environment.

CI
Test plan

- `bun run validate` on `main` locally
- `prompts/second-wbs-session.md` and `agents/production/*.json` are byte-identical to pre-rollup `main` (no behavior drift)

🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Tests
Chores
- `.gitignore` and linting configurations for demo outputs.
- `playwright` dependency and new simulation/validation scripts added to `package.json`.