feat: hackathon expansion — sim council, skills, video, repo polish #15

Merged
richsak merged 47 commits into main from dev
Apr 27, 2026

Conversation

Owner

@richsak richsak commented Apr 27, 2026

Summary

Single rollup PR for the Built-with-Opus-4.7 hackathon submission. 42 commits' worth of expansion work going from dev → main as one atomic block (per AGENTS.md: "the expansion lands atomically so production stays coherent").

What's in this rollup

Today's 5 PRs (all merged to dev already):

Earlier dev work (PRs #1–#8, already merged to dev): genealogy governance layers, Pair Alpha secondary substrate seeding, planner agent spec, apply-worker CLI, memory substrate, simulation council runner, demo arc seeders.

Production safety check

  • The 9 production agent specs in agents/production/ (planner + 5 critics + visual-reviewer + monitor + redesigner) are untouched in their behavior — only their on-disk path moved.
  • prompts/second-wbs-session.md is unchanged. Nicolette's live weekly council still runs the same orchestration.
  • agents/webster-visual-design-critic.json was orphaned (dropped from production) as a deliberate symmetry call with the LP-sim cluster. Not registered against the live API for the production environment.

CI

Test plan

  • CI passes on this PR
  • After merge, run bun run validate on main locally
  • Spot-check that prompts/second-wbs-session.md and agents/production/*.json are byte-identical to pre-rollup main (no behavior drift)
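
The byte-identical spot-check above could be scripted roughly as follows. This is a minimal sketch, not code from this PR: `sha256Hex` and `driftedPaths` are hypothetical helper names, and it assumes digests of the pre-rollup files were recorded before the merge.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: hex SHA-256 digest of a file's bytes.
function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Hypothetical helper: paths whose digest changed, appeared, or disappeared
// between two snapshots (path -> digest). An empty result means no drift.
function driftedPaths(
  before: Record<string, string>,
  after: Record<string, string>,
): string[] {
  const paths = Object.keys(before).concat(Object.keys(after));
  const unique = paths.filter((p, i) => paths.indexOf(p) === i);
  return unique.filter((p) => before[p] !== after[p]).sort();
}
```

Since the production specs moved on disk, the snapshots would likely need to be keyed by file name rather than full path; that keying is left to the caller here.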

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added 9 simulation agent specs for LP workflow testing and demo automation.
    • Introduced memory store provisioning and management infrastructure.
    • Added comprehensive 11-week demo output with analytics, visual reviews, and heatmap artifacts.
    • Added simulation entrypoint commands and environment verification utilities.
  • Documentation

    • Updated operational runbooks and agent invocation patterns.
    • Updated README and architecture docs to reflect 9-agent council structure.
    • Added extensive demo walkthrough documentation with week-by-week visual reviews.
  • Tests

    • Added comprehensive test coverage for simulation, agent registration, and demo manifest generation.
  • Chores

    • Updated .gitignore and linting configurations for demo outputs.
    • Added playwright dependency and new simulation/validation scripts to package.json.
    • Removed obsolete task documentation and PRD files.

richsak and others added 30 commits April 24, 2026 19:40
- Fix dev/ → dev (trailing slash invalid in git branch names)
- Add explicit Worktree + PR flow section: base off dev, PR to dev,
  merge to dev. Forge task branches + Claude Code feat/T<n>-<slug>
  pattern documented.
- Hackathon rollup procedure: dev → main is a single atomic PR after
  T10 completes; production stays coherent until then.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New top-level section in EXPANSION-TASKS.md tells a fresh session to:
- Read AGENTS.md first-actions list in full before coding
- Start T0 immediately without confirmation
- Stop after T0 (validate green + committed), report 3-5 lines to
  Richie, wait for green-light on T1
- From T1 onward: proceed without approval BUT post a 2-line
  "Starting T<n>" announcement with files-touched list before
  each task, so Richie has visibility to interrupt without blocking

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
richsak and others added 15 commits April 24, 2026 22:47
* fix(ci): install ImageMagick + Playwright Chromium in test job

The validate job's "Run tests" step was failing on dev with two
missing-binary errors:

- `magick` not in $PATH — visual-assets / heatmap rendering shells out
  to ImageMagick. Ubuntu runners do not include it by default.
- Playwright Chromium executable missing — browser-audit tests launch
  Chromium via Playwright. The bundled `playwright` package needs an
  explicit `playwright install` to fetch the headless browser.

Adds both as steps before "Run tests" so the dev/main baseline goes
green again and downstream PRs stop inheriting the red.

* fix(ci): symlink magick -> convert for ImageMagick v6 compat

Ubuntu's `imagemagick` package ships v6 binaries (convert, mogrify),
not v7's unified `magick` entrypoint. scripts/build-demo-manifest.ts
calls `execFileSync("magick", ...)` so the v7 name is required.

Symlink `magick` -> `$(which convert)` after the apt install. The
chained-argv syntax we use is v6-compatible.

* fix(ci): install ImageMagick 7 portable binary instead of apt v6

The apt imagemagick package provides v6-only binaries. The repo uses
v7 multi-call dispatch (magick, magick identify, ...) which v6 cannot
satisfy via convert symlinks. Pull the v7 portable binary from the
official ImageMagick release and drop it into /usr/local/bin.

* fix(ci): use v6 + magick dispatcher wrapper instead of v7 download

Portable ImageMagick 7 binary URL was 404 — no canonical 'magick' static
release exists at imagemagick.org. Switch back to apt's v6 install and
add a small bash dispatcher at /usr/local/bin/magick that forwards to
the v6 binary based on the subcommand. Handles 'magick (...)',
'magick identify ...', and the other tools the repo uses.
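
The dispatch rule described in the last commit can be expressed as a small pure function. The real wrapper is a bash script at /usr/local/bin/magick; the TypeScript below is only a sketch of the same idea, and both `resolveV6Command` and the `V6_TOOLS` list are assumptions (the commit message names only `identify` explicitly).

```typescript
// v6 tool names the wrapper forwards to. The exact set the repo needs is an
// assumption; these are standard ImageMagick 6 command names.
const V6_TOOLS = ["identify", "mogrify", "composite", "montage", "compare"];

interface Command {
  bin: string;
  args: string[];
}

// Map the arguments of a v7-style `magick ...` call onto a v6 binary.
function resolveV6Command(argv: string[]): Command {
  const first = argv[0];
  if (first !== undefined && V6_TOOLS.indexOf(first) !== -1) {
    // `magick identify a.png` -> `identify a.png`
    return { bin: first, args: argv.slice(1) };
  }
  // A bare `magick in.png ... out.png` behaves like v6 `convert`.
  return { bin: "convert", args: argv.slice() };
}
```

For example, `resolveV6Command(["identify", "a.png"])` selects the `identify` binary, while any other leading argument falls through to `convert` with the arguments unchanged.
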
* chore(video): add tracking dir for webster-video skill

CONTEXT.md is compaction-resistant (mission, locked decisions, critical
paths, current phase, don't-drift invariants, polish-slot index). Single
Read call restores alignment after autocompact or fresh session.
STATUS.md is the running tracker: phase, latest action, blockers, render
history, day log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(video): hydrate 11 weeks of LP demo assets + add hydrate script

Day 0 of the webster-video skill plan. Copies weekly council-simulation
artifacts from local-runs/lp-council/w01-single-offer-visual-heatmap/
(simulation working copy) into demo-output/landing-page/wNN/ (committed
deliverable, the canonical handoff per T7/T9 task graph).

Per-week (w00-w10): desktop/mobile/tablet PNGs + matching heatmap SVGs +
analytics.json + heatmap.json + visual-review.md (w00 omits since it is
the pre-council baseline). Plus brand.json, agents.json, manifest.json
with hash + week list at the run root.

Also gitignores local-runs/ (now that the durable copy is in
demo-output/) and audio/*.raw.mp3 (only the Auphonic-leveled narration.mp3
is committed).

Unblocks remote planning surfaces (Ultraplan, fresh clones) which only
see committed files; the timelapse story now travels with the repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(video): scaffold HyperFrames project at video/

Installed @heygen-com/hyperframes skill bundle (5 sub-skills: gsap,
hyperframes, hyperframes-cli, hyperframes-registry, website-to-hyperframes)
into .agents/skills/, symlinked into .claude/skills/ for Claude Code.

Ran hyperframes init video --example blank to scaffold the project root
with index.html (1920x1080 master composition, GSAP timeline registered on
window.__timelines["main"]), hyperframes.json config, and the bundled
AGENTS.md / CLAUDE.md guides.

Pre-flight verified: node v25.9.0, ffmpeg 8.0.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(video): structural draft — 7 scenes + master composition

Day 1 baseline of the webster-video skill plan. The structural draft
renders the full 130-second story end-to-end via HyperFrames sub-compositions,
with plain-but-working baselines that Claude Design will polish in Phase B.

Layout:
- video/index.html — master composition (1920x1080 at 130s) sequencing the 7
  sub-compositions via data-composition-src
- video/compositions/{title-card,before-state,transformation,learning-beat,
  recovery-arc,final-state,end-card}.html — one HyperFrames sub-composition
  per storyboard beat. Each has data-composition-id, scoped GSAP timeline
  registered on window.__timelines, and class="clip" elements with
  data-start/data-duration/data-track-index per the framework contract.
- video/shared.css — brand tokens (deep teal, warm cream, leaf green,
  charcoal, soft gold) + typography (Cormorant Garamond, Inter, IBM Plex
  Sans Condensed) + 5 reusable component classes (brand-title,
  synthetic-disclaimer, stat-counter, heatmap-overlay, council-ring) +
  scene primitives. Single source of truth for visual primitives;
  Claude Design will polish individual classes per the slot contract.
- video/data/{metrics,brand,council}.json — typed copies of the per-week CSV
  from prompts/video-composition-session.md, brand tokens from
  demo-output/landing-page/brand.json, and the 10-agent council roster.
- video/lib/{easings,trim-points}.js — shared GSAP easings and the 90s
  social-cut frame-range definitions for the trim-points polish slot.
- video/script.md — 130s narration with beat-by-beat visual cues, callouts,
  voice direction, and an honesty checklist tying every claim back to the
  synthetic-data invariant.
- video/assets — symlink to ../demo-output/landing-page so HyperFrames
  serves committed weekly artifacts via static paths.

Validation:
- npx hyperframes lint: 0 errors, 0 warnings across all 8 files
- npx hyperframes inspect: 0 layout issues across 9 timeline samples

Also gitignores the AI-platform symlinks installed by `skills add` (parallel
agent installs we don't use) and treats .agents/skills/ + .claude/skills/
like node_modules (re-installable via npx skills add heygen-com/hyperframes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(video): update STATUS — Day 1 draft rendered; CONFIRMATION GATE

All 7 HyperFrames sub-compositions render with brand-aligned visuals:
0 lint errors, 0 layout issues, 7/7 scene snapshots passed visual check.
video/snapshots/ and video/renders/ added to .gitignore (regenerable).
Awaiting Richie's confirmation before Day 2 (audio chain, slot packets,
Claude Design polish, final render).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(video): polish round — type tightening, layered shadows, motion choreography

shared.css: layered card shadows via --brand-shadow-card, tightened brand-title
letter-spacing, refined synthetic-disclaimer with left-edge teal accent + soft
drop shadow, stat-counter value depth via text-shadow + tabular-nums, thicker
callout-chip accent border.

7 scene timelines: mask-reveal entrances on title-card + end-card via
clipPath inset; overshoot-stagger callouts on before/transformation/learning/
recovery/final via back.out(1.4); scale-on-entry for w04 + w09; PASS chip
stamps with back.out(1.7). Counter ramp on transformation kept as the single
metric story arc.

No HTML structure changes — data-* attributes, .clip class, and
window.__timelines registration preserved per HyperFrames contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(video): silent render + mux-narration script

Silent render: demo-output/videos/webster-lp-demo.silent.mp4
130s · 1920x1080 · 30fps · h264 · ~10 MB · rendered with hyperframes
-q high --strict in ~58s wall with 6 parallel workers.

mux-narration.ts: replaces planned auphonic-process.ts with ffmpeg
loudnorm (-16 LUFS) + bandpass + AAC mux. Auto-detects audio/narration.raw.{wav,mp3,m4a}, levels to audio/narration.mp3, muxes
silent mp4 + leveled audio into demo-output/videos/webster-lp-demo.mp4.
--skip-level flag for users who pre-level externally (Auphonic).
Uses execFileSync (no shell concat) per security hook.
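
A rough sketch of how the leveling and mux steps might assemble their ffmpeg arguments is shown below. It is not the contents of mux-narration.ts: the function names, the extension search order, and the 80 Hz / 12 kHz band edges are assumptions made for illustration. Only the -16 LUFS target, the AAC mux, and the file locations come from the commit message.

```typescript
// Assumed search order for the raw narration file.
const RAW_EXTENSIONS = ["wav", "mp3", "m4a"];

// Return the first audio/narration.raw.<ext> that exists, if any.
function pickRawNarration(exists: (path: string) => boolean): string | undefined {
  for (const ext of RAW_EXTENSIONS) {
    const candidate = "audio/narration.raw." + ext;
    if (exists(candidate)) return candidate;
  }
  return undefined;
}

// Band-limit (assumed edges), then normalize loudness to -16 LUFS.
function levelArgs(rawPath: string, leveledPath: string): string[] {
  const filters = "highpass=f=80,lowpass=f=12000,loudnorm=I=-16";
  return ["-y", "-i", rawPath, "-af", filters, leveledPath];
}

// Copy the video stream untouched and encode the narration as AAC.
function muxArgs(silentPath: string, audioPath: string, outPath: string): string[] {
  return [
    "-y", "-i", silentPath, "-i", audioPath,
    "-map", "0:v:0", "-map", "1:a:0",
    "-c:v", "copy", "-c:a", "aac", "-shortest", outPath,
  ];
}
```

Each array can be handed to `execFileSync("ffmpeg", args)` as-is, which keeps file names out of a shell command string.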

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(video): rebuild scenes — paired scrolling LPs + Data/Council/Outcome panel

Per user feedback after first silent render:
- Drop mobile screenshots; desktop only.
- Show full LP scrolling top-to-bottom over each scene's duration.
- Center the LP up top, place analytics + narrative below.
- Show week evolution by pairing last week | current week.
- Pinpoint what data said and how the council decided to react.

shared.css rev 2: drop screenshot-card, week-chip, callout-chip, heatmap-*,
council-ring*, stat-counter*. Add lp-pair-stage / lp-week-block / lp-week-label
(+ chip current/dim/final variants) / lp-card (860x600, overflow hidden) /
lp-card--dim (saturate 0.5 brightness 0.94) / lp-image (absolute, GSAP
translateY for scroll). Add narrative-panel / narrative-col (+ decision/
outcome variants) / narrative-eyebrow / narrative-body / narrative-stat
(+ --small variant for two-value displays) / narrative-stat__delta.

Pair scheme:
- before-state w00 solo (baseline narrative)
- transformation w01 | w02 (council's first transformation)
- learning-beat w03 | w04 (experiment + correction)
- recovery-arc w08 dim | w09 (failure + recovery)
- final-state w00 dim | w10 (full-arc bookend recap)

Scroll distances hardcoded per week from native 1440-wide screenshot heights:
526 / 2162 / 3362 / 3414 / 3727 / 3671 / 3599 / 3587 px. Linear ease across
scene minus 2s entry buffer. w00 short height creates intentional slow scroll
versus dense w10 fast scroll — visual story of "the page grew rich over 11
weeks". Counter ramps preserved on transformation (151→343) and final-state
(151→323).
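
The scroll timing can be illustrated with a small calculation. This is a sketch under assumptions, not the scene code: it assumes the 2s entry buffer sits at the start of the scene and the scroll then runs linearly to the scene's end; `scrolledPx` is a hypothetical name and the 20s scene length in the example is made up.

```typescript
// Pixels scrolled at time t (seconds into the scene). The image's
// translateY would be the negative of this value.
function scrolledPx(
  t: number,
  sceneSeconds: number,
  distancePx: number,
  entryBufferSeconds: number = 2,
): number {
  const scrollSeconds = sceneSeconds - entryBufferSeconds;
  if (scrollSeconds <= 0) return 0;
  // Linear progress, clamped to [0, 1] outside the scroll window.
  const raw = (t - entryBufferSeconds) / scrollSeconds;
  const progress = Math.min(1, Math.max(0, raw));
  return distancePx * progress;
}
```

With a 20s scene and the 526 px distance, the page sits still for the first 2s and is halfway (263 px) at t = 11s. A larger distance over the same duration scrolls proportionally faster, which is the slow-versus-fast contrast the commit describes.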

title-card and end-card unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(video): build Claude Design polish bundle (7 slots, 1.3 MB)

Single-shot polish workflow: prepare 7 self-contained slot packets,
upload as zip to claude.ai/design, receive polished bundles, integrate
back via apply-polish-bundle.ts.

build-slot-packets.ts: generates skills/webster-video/polish-slots/<slot>/
for each of the 7 scenes (title-card, before-state, transformation,
learning-beat, recovery-arc, final-state, end-card). Per slot:
  - slot.json (master frame range + fps/dimensions)
  - brand.tokens.json (palette, typography, motion, honesty constraints,
    HyperFrames contract reminder)
  - baseline.html (current scene)
  - baseline.css (full shared.css for context)
  - baseline.png (mid-scene snapshot from video/snapshots/)
  - acceptance.md (measurable "done" criteria + visual goals + "what can
    change")
  - DO_NOT_TOUCH.md (locked text, numbers, durations, scroll distances,
    honesty framing, HyperFrames contract)
Plus polish-slots/README.md (operator workflow) and PROMPT.md (system
prompt to paste into claude.ai/design).

apply-polish-bundle.ts: reads polish-slots/<slot>/handoff/*.html (full
replacement) and optional handoff-shared/shared.css. Runs hyperframes
lint after, prints next steps. Does not auto-commit — review via
git diff. Uses execFileSync (no shell concat) per security hook.

.gitignore: added skills/webster-video/polish-slots/**/handoff/ and
handoff-shared/ and polish-slots.zip. Baseline slot files are committed;
polished handoff bundles stay local until reviewed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(video): free up Claude Design brief — only constraint is 150s ceiling

User pivot: rather than locking text, durations, honesty framing, and
HyperFrames-contract per scene, give Claude Design full autonomy with one
absolute constraint — total video ≤ 2.5 min (150s).

Changes:
- PROMPT.md rewritten as a free-form brief. Layout, scene count, scene
  durations, copy, typography, motion language, color treatment — all
  up to Claude Design.
- README.md slimmed to match.
- 14 per-slot constraint files removed (acceptance.md + DO_NOT_TOUCH.md
  for each of 7 slots).
- brand.tokens.json stripped: palette + typography + motion easings +
  card shadow only. No voice / design_direction / constraints / honesty
  framing / HyperFrames-contract reminder text.
- slot.json field rename: master_*_seconds → current_master_*_seconds
  (signals these are starting context, not specs).
- build-slot-packets.ts: added cleanStaleFiles() so future regenerations
  remove acceptance.md / DO_NOT_TOUCH.md if reintroduced. Script body
  shrank from 518 lines to ~200.
- polish-slots.zip regenerated (1.3 MB).

Net: -1303 / +150 lines. Bundle is now consistently free.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(video): prettier formatting on polish-bundle artifacts

No semantic changes — prettier expanded multi-line template literals in
build-slot-packets.ts and trimmed trailing whitespace in STATUS.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(video): surface demo-arc artifacts for judges; gitignore rendered mp4

Adds a "Demo arc artifacts" line under README "## For judges" pointing at the
11-week simulation council deliverables under demo-output/landing-page/, the
Managed Agents memory-store screenshot, and the local reproduce command.

Adds demo-output/videos/ to .gitignore. The rendered timelapse mp4 is hosted
externally for the hackathon submission rather than committed (avoids 70+ MB
binary in git history; the rendering pipeline + committed substrate already
proves reproducibility).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(video): silence lint — global window comment + interface defs

- video/lib/easings.js, trim-points.js: add /* global window */
  (flat-config replacement for the deprecated /* eslint-env browser */).
- skills/webster-video/scripts/hydrate-demo-assets.ts: convert two
  type-alias decls to interface form to satisfy
  @typescript-eslint/consistent-type-definitions.

Closes the 6 lint errors blocking PR #13's CI.

* fix(video): markdown lint — ignore agent outputs, fix manual doc nits

- .markdownlint-cli2.jsonc: add demo-output/ and video/assets/ to
  ignores. Both contain auto-generated visual-review.md files (same
  category as history/, already ignored).
- Auto-fix MD031/MD032/MD026/MD034 in video/AGENTS.md, video/CLAUDE.md,
  skills/webster-video/polish-slots/PROMPT.md.
- Add explicit language tag (`text` / `bash`) to two unfenced code
  blocks in video/CLAUDE.md and skills/webster-video/polish-slots/README.md
  to satisfy MD040.

Closes the 27 markdown lint errors blocking PR #13's CI.

* fix(test): bump browser-audit timeout to 30s for CI Chromium cold start

The Playwright screenshot test was timing out at bun:test's default 5s.
First Chromium launch in clean CI is 5–10s. The sibling test in
run-simulation.test.ts that exercises the same code path took 4.9s
on this run. Bumping to 30s gives margin without slowing happy-path.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trims Webster's agent surface to a single substrate ahead of the
hackathon submission. Site-sim was committed but unused by the demo
video. visual-design-critic was a runtime genealogy spawn with no
id.txt registration and no production orchestrator references, so
removing it brings production back to a clean 9-agent set that
mirrors the 9-spec LP-sim set 1:1.

Deletions: 9 site-sim agent specs, demo-sites/northwest-reno/,
scripts/run-simulation-site.ts, agents/visual-design-critic.json,
package.json sim:site, sim-agents.json + memory-stores.json site
halves. run-simulation.ts collapses to lp-only literal types and a
single-page screenshot path. register-sim-agents.ts/sim-preflight.ts
expectations drop from 18→9 sim specs, 2→1 substrate. sim-council.md
fan-out drops the SUBSTRATE/AGENT_SET site case and the
licensing-and-warranty-critic arm. Affected tests rewritten to
LP-only assertions; no tests deleted outright.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Splits the flat agents/*.json layout into two subdirectories so the
operational role of each spec is obvious from the path:
- agents/production/ — 9 specs that run Nicolette's live council
- agents/simulation/ — 9 specs that drive the LP timelapse demo

The two sets are now 1:1 symmetric. validate-agents.ts and the
schema test recurse through agents/** so the strict Ajv schema gate
is preserved without any weakening. Hardcoded paths in
critic-genealogy.ts, planner-invoke.ts, and the affected tests are
updated to the subdir layout. Genealogy spawns now write into
agents/production/ so runtime-spawned critics land beside the rest
of the production set. agents/AGENTS.md updated to describe the
new shape; agents/CLAUDE.md follows via the existing symlink.
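
A recursive walk of the kind validate-agents.ts now needs might look like the sketch below. It is an illustration only: `collectSpecPaths` and the injectable `listDir` parameter are hypothetical, and the actual script may discover files differently.

```typescript
import { readdirSync } from "node:fs";
import { join } from "node:path";

interface DirEntry {
  name: string;
  isDirectory(): boolean;
}
type ListDir = (dir: string) => DirEntry[];

// Default lister backed by the real filesystem.
const listWithFs: ListDir = (dir) => readdirSync(dir, { withFileTypes: true });

// Collect every *.json file under root, descending into subdirectories.
function collectSpecPaths(root: string, listDir: ListDir = listWithFs): string[] {
  const found: string[] = [];
  for (const entry of listDir(root)) {
    const full = join(root, entry.name);
    if (entry.isDirectory()) {
      found.push(...collectSpecPaths(full, listDir));
    } else if (entry.name.endsWith(".json")) {
      found.push(full);
    }
  }
  return found.sort();
}
```

Calling `collectSpecPaths("agents")` would return specs from both agents/production/ and agents/simulation/, so a schema check applied to that list covers both sets.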

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Updates the judge-facing surface and the canonical north-star docs
to match the new single-substrate, 9-agent shape:

- README pitch (7→9 agents), architecture diagram now shows the
  full 7-session fan-out (5 critics + monitor + visual-reviewer),
  repo layout reflects agents/{production,simulation}/, submission
  notes report 9 specs and 175 tests
- AGENTS.md mission says single-substrate Richer Health LP demo
  and points the do-not-touch rule at agents/production/
- context/VISION.md drops the dual-substrate framing: the demo arc
  is LP-only, memory store count is 6 not 12, the API cost note
  scales to one substrate, and the locked/out-of-scope sections
  reflect the cut

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
refactor: trim agents to production/ + simulation/, drop site-sim
Section A polish — pre-judging cleanup. Moved internal-only docs to
~/Vault/Projects/webster/internal-tracking/ (preserved locally, removed
from public tree).

Untracked from context/:
  EXPANSION-TASKS.md, E2E-IMPLEMENTATION-TRACKER.md, SITE-FORK-CHECKLIST.md,
  ROADMAP.md, VIDEO-PLAN.md, VIDEO-PLAN-90s.md, v2-design.md

Untracked from prompts/ (only first/second-wbs and sim-council stay public,
matching what README documents):
  third-wbs-session.md, fourth-wbs-session.md, sim-audit-fix-session.md,
  composition-session.md, e2e-demo-run-session.md, sim-runner.md

Untracked from history/:
  AGENTS.md (+ CLAUDE.md symlink)

Untracked .forge/ralph/ — already gitignored for new dispatches; old PRDs
were still in the index. Forge workflow YAMLs and config remain tracked.

AGENTS.md updated to redirect EXPANSION-TASKS references to the vault path
and drop stale doc references.
Implements the P0–P5 phase model locked in context/ONBOARDING-CASE-STUDY.md
(Q1–Q15). Skill is a thin shell over scripts/onboarding/* and the runtime
registration patterns in prompts/first-wbs-session.md, with status-file
resume and machine-checked phase exit gates.

- skills/webster-onboarding/SKILL.md — orchestration, P0 overview through
  P5 first council, ! pre-load checks, hard rules on key handling
- references/{qa-bank,business-yaml-schema,key-handling,remediation,
  empire-fixture}.md — detail split per skill-authoring conventions
- scripts/onboarding/verify-env.ts — reads .env.local, hits Anthropic /
  GitHub / Cloudflare verify endpoints without echoing key values
- scripts/onboarding/verify-all.ts — rollup with --phase {p3,p4}, agent
  count derived dynamically from agents/*.json non-sim specs
- scripts/onboarding/scaffold-repo.ts — creates GitHub repo + Astro
  starter using brand identity from context/business.yaml
- package.json — wire onboarding:verify-env, :verify-all, :scaffold-repo
- context/ONBOARDING-CASE-STUDY.md — fix 9-vs-10 production agent drift

Smoke-tested verify-env (exit 1 on missing .env.local with clean hint)
and verify-all --phase p3 (per-check ✓/✗, exit 1 on real env gaps).
End-to-end skill invocation and scaffold-repo (side-effecting) untested
— follow up via Empire dry-run before recording.
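
The "without echoing key values" behaviour of verify-env can be sketched as below. This is not the script itself: `parseEnvFile` and `describeKeys` are hypothetical names, the parser handles only simple KEY=VALUE lines, and the variable names in the usage note are assumed. The network calls to the verify endpoints are omitted.

```typescript
// Parse KEY=VALUE lines; skip blanks and # comments; strip one pair of quotes.
function parseEnvFile(text: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const first = value.charAt(0);
    const quoted =
      value.length >= 2 &&
      (first === '"' || first === "'") &&
      value.charAt(value.length - 1) === first;
    if (quoted) value = value.slice(1, -1);
    env[key] = value;
  }
  return env;
}

// Report presence only; the value itself never appears in the output.
function describeKeys(env: Record<string, string>, required: string[]): string[] {
  return required.map((name) =>
    env[name] ? "✓ " + name + " is set" : "✗ " + name + " is missing",
  );
}
```

A caller might pass names such as `ANTHROPIC_API_KEY`, `GITHUB_TOKEN`, and `CLOUDFLARE_API_TOKEN` (assumed here) and exit non-zero if any line starts with ✗.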

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- README: add a "The `wbs` alias" section under Prerequisites so judges
  who clone the repo can replicate the dispatcher launcher (or run the
  equivalent `claude --settings ...` directly without aliasing).
- Untrack `deploy/webster-dispatcher.plist`. The launchd plist hardcodes
  Richie's macOS user paths (`/Users/richiesakhon/...`) so it cannot be
  shipped publicly. Preserved in `~/Vault/Projects/webster/internal-tracking/deploy/`.
- Run `bun run format` to fix prettier table-padding in
  `context/ONBOARDING-CASE-STUDY.md` (PR #12 CI format-check failure).
feat(skill): webster-onboarding v2 with verify scripts
Library-shape conversion of prompts/second-wbs-session.md so operators run
the weekly council pass via /webster-weekly-council. SKILL.md is a slim
phase index; nine references/ files hold per-phase bash blocks loaded on
demand; two skill-local scripts/ extract the reusable polling helper and
the planner-JSON parser. The 662-line single-page prompt remains intact
as the locked source-of-truth runbook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update README, AGENTS, context/ARCHITECTURE, context/FEATURES, and
history/AGENTS so the weekly-run operator surface is the new library
skill (slash-command form) with prompts/second-wbs-session.md framed as
the locked single-page runbook fallback. The prompt itself is unchanged
and still locked by scripts/__tests__/sim-council.test.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat: webster-weekly-council library skill (route operators to /webster-weekly-council)

coderabbitai Bot commented Apr 27, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 39b7c309-a1ca-4a74-a714-f5d9f98b58a1

📥 Commits

Reviewing files that changed from the base of the PR and between 38afb67 and 4a35def.

📒 Files selected for processing (13)
  • .gitignore
  • AGENTS.md
  • README.md
  • context/ARCHITECTURE.md
  • context/DOMAIN-MODEL.md
  • context/FEATURES.md
  • context/QUALITY-GATES.md
  • context/VISION.md
  • demo-output/landing-page/INDEX.md
  • prompts/video-composition-session.md
  • scripts/watch-dispatches.sh
  • skills/webster-onboarding/SKILL.md
  • skills/webster-onboarding/references/qa-bank.md

📝 Walkthrough

This pull request consolidates a major architectural shift for the Webster project: it removes product requirements for multiple features (apply-worker CLI, genealogy governance, orchestrator memory planner, agent specs, and demo seeding), establishes a production roster of 9 managed agents mirrored by 9 simulation agents, introduces comprehensive simulation and onboarding infrastructure, and commits a complete 11-week demo landing page with analytics, heatmaps, and visual reviews. It also updates CI tooling and eliminates task planning documentation in favor of a shipped artifact-focused state.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Feature PRD Deletions | `.forge/ralph/*/prd.json`, `.forge/ralph/*/prd.md`, `.forge/ralph/*/progress.txt` | Removes all requirements documentation for apply-worker-cli-v5, genealogy-gov-v1, orch-memory-planner-v1, planner-agent-spec-v5, seed-demo-arc-w3w4-v5, and webster-feature-58. No replacement specifications are provided. |
| Production Agent Specs | `agents/production/...` (references in tests), `agents/visual-design-critic.json` deleted | Shifts agent registry from flat directory to agents/production/ subdirectory; retires visual-design-critic; updates test paths accordingly. |
| Simulation Agent Roster | `agents/simulation/webster-lp-sim-*.json` | Introduces 9 mirrored simulation agents for LP substrate: monitor, planner, redesigner, visual-reviewer, and 5 critic agents (brand-voice, conversion, copy, fh-compliance, seo), each with tailored system prompts and MCP tooling. |
| Demo Landing Page Context | `demo-landing-page/context/brand.json`, `demo-landing-page/context/business.md`, `demo-landing-page/context/personas.json` | Defines Richer Health brand, business identity, and three audience personas as structured configuration for simulation agents to reference. |
| Demo Landing Page HTML/CSS | `demo-landing-page/ugly/index.html`, `demo-landing-page/ugly/style.css`, `demo-landing-page/ugly/README.md` | Introduces the intentionally unimproved baseline landing page (week 0) with clinical positioning, form, and responsive design serving as the sim starting point. |
| Demo Output Artifacts (Weeks 0–10) | `demo-output/landing-page/w00.../w10/` subdirectories | Commits 11 weeks of synthetic simulation outputs: analytics.json, analytics-reasoning.md, heatmap.json, visual-review.md, and optional screenshots/ artifacts for each week demonstrating iterative improvement. |
| Simulation Scripts | `scripts/run-simulation.ts`, `scripts/run-simulation-lp.ts`, `scripts/sim-council.md`, `scripts/sim-capture-bridge.ts`, `scripts/sim-preflight.ts` | Adds core simulation orchestration: weekly config management, council session invocation, memory store capture bridging, and environment preflight validation. |
| Memory and Agent Registration | `scripts/provision-memory-stores.ts`, `scripts/register-sim-agents.ts`, `context/memory-stores.json`, `context/sim-agents.json` | Introduces Managed Memory Store provisioning, simulation agent registration, and corresponding JSON manifests mapping substrate+role to store/agent IDs. |
| Context and Manifest Builders | `scripts/build-demo-manifest.ts`, `scripts/emit-memory-screenshot-manifest.ts`, `scripts/context-schema.ts`, `scripts/run-markdown-bash.ts` | Adds schema validation, manifest assembly, screenshot aggregation, and markdown-to-bash execution helpers supporting demo/simulation workflows. |
| Onboarding Scripts | `scripts/onboarding/verify-env.ts`, `scripts/onboarding/verify-all.ts`, `scripts/onboarding/scaffold-repo.ts` | Introduces environment verification (API keys, auth tokens), repository readiness checks (agents registered, memory stores provisioned), and automated repo scaffold for new instances. |
| Browser Audit and Capture | `scripts/browser-audit.ts`, `scripts/capture-mem-stores.ts` (updates/additions) | Expands Playwright fallback logic and adds memory store screenshot capture via browser-use, with auth-expired and evidence-validation checks. |
| Test Coverage | `scripts/__tests__/*.test.ts` (11 new/updated files) | Adds comprehensive Bun tests for anthropic agents pagination, browser-audit fallback, simulation execution, memory provisioning, agent registration, context schema validation, and CLI entrypoints. |
| Documentation and Plan Deletions | `context/EXPANSION-TASKS.md`, `context/ROADMAP.md`, `context/v2-design.md`, `prompts/fourth-wbs-session.md`, `prompts/third-wbs-session.md`, `prompts/video-composition-session.md`, `history/AGENTS.md`, `deploy/webster-dispatcher.plist` | Removes task planning, roadmap tracking, apply/review design docs, and operator runbooks; eliminates LaunchAgent dispatcher and history governance rules. |
| Architecture and Vision Updates | `context/ARCHITECTURE.md`, `context/VISION.md`, `context/DOMAIN-MODEL.md`, `README.md`, `AGENTS.md`, `agents/AGENTS.md` | Refocuses narratives from 7-agent/dual-substrate to 9-production + 9-sim roster, single LP substrate, planner/visual-reviewer additions, and Managed Memory Stores as state layer. |
| Feature Status Consolidation | `context/FEATURES.md`, `context/QUALITY-GATES.md` | Updates feature status from task-driven to shipped-inventory view; removes date-gated gates; refocuses on production agent specs and demo validation. |
| CI and Config Updates | `.github/workflows/test.yml`, `.gitignore`, `.markdownlint-cli2.jsonc`, `.prettierignore`, `package.json` | Adds ImageMagick and Playwright provisioning; extends ignore rules for demo-output/, audio artifacts, and skill symlinks; adds playwright devDependency and simulation/onboarding scripts. |
| History and Demo Assets | `history/lp-demo/w00/`, `history/lp-demo/w01/`, `assets/memory-stores-screenshots/manifest.json` | Seeds history with synthetic analytics reasoning and baseline week-0/w01 data; adds memory screenshot manifest for proof artifacts. |
| Demo Output Index | `demo-output/landing-page/agents.json`, `demo-output/landing-page/brand.json`, `demo-output/landing-page/manifest.json`, `demo-output/landing-page/INDEX.md` | Establishes demo-output as a standalone artifact zone with localized agent configs, brand refs, manifest schema, and navigation guide for reviewers. |

Sequence Diagram(s)

sequenceDiagram
    participant Operator as Operator
    participant SimLp as run-simulation-lp
    participant RunSim as runSimulation
    participant SimCouncil as sim-council.md
    participant Council as Managed<br/>Agent Council<br/>(9 agents)
    participant MemStore as Memory Stores
    participant CaptureBridge as sim-capture-bridge
    participant Browser as browser-use

    Operator->>SimLp: bun run sim:lp
    SimLp->>RunSim: runSimulation(lpConfig)
    
    loop for each week (0-10)
        RunSim->>RunSim: Generate synthetic analytics
        RunSim->>RunSim: Capture screenshots (Playwright)
        RunSim->>SimCouncil: runCouncil(week, config)
        
        rect rgba(0, 150, 200, 0.5)
            SimCouncil->>Council: Create planner session
            Council->>MemStore: Read council memory
            Council-->>SimCouncil: Plan output
        end
        
        rect rgba(200, 100, 0, 0.5)
            SimCouncil->>Council: Fan out monitor + 5 critics<br/>(parallel sessions)
            Council->>MemStore: Read/attach per-role memory
            Council-->>SimCouncil: Findings for each role
        end
        
        rect rgba(100, 150, 0, 0.5)
            SimCouncil->>Council: Create redesigner session
            Council-->>SimCouncil: Proposal + decision
        end
        
        rect rgba(150, 50, 150, 0.5)
            SimCouncil->>Council: Create visual-reviewer session
            Council-->>SimCouncil: Visual review markdown
        end
        
        RunSim->>RunSim: Write week artifacts<br/>(analytics, heatmap, review)
        
        alt week is 1, 5, or 10
            RunSim->>CaptureBridge: Emit CAPTURE_TRIGGER
            CaptureBridge->>Browser: Spawn capture-mem-stores
            Browser->>MemStore: Screenshot memory console
            Browser-->>CaptureBridge: Proof artifact
        end
    end
    
    RunSim-->>Operator: Return week HEADs + demo-output/

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • PR #3: Adds webster-planner agent spec and related tests; main PR updates planner spec references and removes planner PRD docs, so they target the same agent-spec surface.
  • PR #6: Adds orch-memory-planner-v1 feature and PRD; main PR deletes that same planner orchestration PRD, indicating a feature scope consolidation or cancellation at the orchestrator level.

Poem

🐰 Whiskers twitching with glee,
Nine agents dance where seven used to be,
Weeks bloom in demo-output's care,
Memory stores remember everywhere,
A simulation hops from week to week,
Until the landing page reaches its peak! 🏞️

# Conflicts:
#	.gitignore
#	scripts/__tests__/browser-audit.test.ts
#	scripts/browser-audit.ts

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
context/VISION.md (1)

107-110: ⚠️ Potential issue | 🟡 Minor

Remove the remaining two-substrate execution-plan text.

These lines still talk about 18 sim agents, per-substrate invocations, and “both” simulations, which conflicts with the single-LP scope established above and reintroduces deleted scope into the canonical vision doc.

Also applies to: 154-155

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@context/VISION.md` around lines 107 - 110, Remove the leftover two-substrate
execution-plan text that reintroduces deleted scope: delete or revise the
phrases referencing "T2 (18 sim agent specs)", "T8 (per-substrate invocations)",
and "both simulations" (and the similar lines at 154-155) so the Day 2/Day 3
plan reflects the single-LP scope; replace with a short single-LP equivalent
(e.g., consolidate into a single simulation/task line) and ensure any tokens
like T2/T8 are updated or removed to avoid implying multi-substrate execution.
AGENTS.md (1)

5-10: ⚠️ Potential issue | 🟡 Minor

Fix the agent-spec locations in the operator guide.

This still says the runtime critics live in agents/*.json, but this PR moves them into agents/production/ and agents/simulation/. Leaving the old path here makes the safety boundary much easier to miss.

As per coding guidelines: AGENTS.md: Document agent implementations and capabilities in AGENTS.md.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@AGENTS.md` around lines 5 - 10, Update the operator guide in AGENTS.md to
replace the outdated runtime-critic path `agents/*.json` with the new locations
`agents/production/` and `agents/simulation/`, and adjust the sentence that
references "Claude Managed Agents (specs in `agents/*.json`)" to clearly point
to those two directories; keep the rest of the paragraph distinguishing
"Implementation operators" from "Runtime critics" and retain the cross-reference
to `skills/webster-lp-audit/SKILL.md` for runtime critic guidance.
🟠 Major comments (27)
demo-output/landing-page/w07/analytics.json-14-16 (1)

14-16: ⚠️ Potential issue | 🟠 Major

Fix CTA click count mismatch between aggregate and persona totals

cta_clicks.discovery_call is 331 (Line 15), while persona cta_clicks sum to 332 (131+111+90). This makes the artifact internally inconsistent and unreliable for trend/validation tooling.

Proposed fix (pick one source of truth and align)
   "cta_clicks": {
-    "discovery_call": 331
+    "discovery_call": 332
   },

Based on learnings: Do not fabricate analytics numbers or business stats.

Also applies to: 17-38

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@demo-output/landing-page/w07/analytics.json` around lines 14 - 16, The
aggregate cta_clicks value is inconsistent with the persona totals:
cta_clicks.discovery_call is 331 while the sum of the persona cta_clicks
(131+111+90) is 332; pick a single source of truth and make them match—either
update cta_clicks.discovery_call to 332 to reflect the persona totals, or adjust
the persona counts so their sum equals 331—and apply the same reconciliation to
the other similar cta_clicks blocks referenced in the file (the persona
cta_clicks sections).
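The reconciliation requested here could be guarded by a small consistency check. This is a minimal sketch, assuming the analytics file exposes an aggregate `cta_clicks.discovery_call` and a per-persona `cta_clicks` count; the `Analytics` type, the `personas` field name, the persona keys, and `ctaMismatch` are hypothetical names for illustration, not the repo's actual schema.

```typescript
// Hypothetical shape: field names are assumptions, not the repo's real schema.
interface PersonaStats {
  cta_clicks: number;
}

interface Analytics {
  cta_clicks: { discovery_call: number };
  personas: Record<string, PersonaStats>;
}

// Returns aggregate minus persona sum; 0 means the artifact is consistent.
function ctaMismatch(a: Analytics): number {
  const personaTotal = Object.values(a.personas).reduce(
    (sum, p) => sum + p.cta_clicks,
    0,
  );
  return a.cta_clicks.discovery_call - personaTotal;
}

// Values taken from the w07 finding: 331 aggregate vs 131 + 111 + 90.
// The persona keys "a", "b", "c" are placeholders.
const w07: Analytics = {
  cta_clicks: { discovery_call: 331 },
  personas: {
    a: { cta_clicks: 131 },
    b: { cta_clicks: 111 },
    c: { cta_clicks: 90 },
  },
};

console.log(ctaMismatch(w07)); // -1
```

A check like this does not decide which number is the source of truth; it only flags that the two disagree, which is what the trend/validation tooling needs to know.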
demo-output/landing-page/w07/analytics.json-92-95 (1)

92-95: ⚠️ Potential issue | 🟠 Major

Use a consistent event metric key (cta_clicks)

Line 93 uses cta_click (singular), but the payload uses cta_clicks elsewhere (Line 14, Line 22/29/36). This can break downstream parsing/grouping keyed on metric names.

Proposed fix
-      "metric": "cta_click",
+      "metric": "cta_clicks",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@demo-output/landing-page/w07/analytics.json` around lines 92 - 95, The metric
key in the JSON snippet uses "cta_click" but the rest of the payload uses
"cta_clicks", so update the "metric" field value from "cta_click" to
"cta_clicks" (the "metric" property in the object shown with version_sha
"synthetic-lp-w07-f52d97d3-ui-adjusted") to match other occurrences (lines
showing "cta_clicks" at positions referenced) and ensure downstream parsers
aggregate correctly; verify no other objects still use the singular form.
scripts/emit-memory-screenshot-manifest.ts-38-39 (1)

38-39: ⚠️ Potential issue | 🟠 Major

Do not persist absolute filesystem paths in manifest entries.

Lines 39 and 52 serialize machine-local absolute paths into versioned artifacts, which makes manifests non-portable and leaks local directory structure.

Proposed fix (store root-relative paths, keep absolute only for stat)
-import { dirname, resolve } from "node:path";
+import { dirname, join, resolve } from "node:path";
@@
-      const path = resolve(dir, file);
-      screenshots.push({ substrate, week, path, bytes: statSync(path).size });
+      const absolutePath = resolve(dir, file);
+      screenshots.push({
+        substrate,
+        week,
+        path: join(substrate, file),
+        bytes: statSync(absolutePath).size,
+      });
@@
-          const path = resolve(manualDir, file);
-          return { substrate: "manual", week: null, path, bytes: statSync(path).size };
+          const absolutePath = resolve(manualDir, file);
+          return {
+            substrate: "manual",
+            week: null,
+            path: join("manual", file),
+            bytes: statSync(absolutePath).size,
+          };

Also applies to: 51-53

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/emit-memory-screenshot-manifest.ts` around lines 38 - 39, The
manifest entries currently store machine-local absolute paths (the variable path
built from resolve(dir, file)), which leaks local filesystem layout; change the
logic that pushes into screenshots so that statSync is still called with the
absolute resolved path but a root-relative path string is stored instead (e.g.,
compute relPath = relative(manifestRoot, absolutePath) using relative from
node:path, and push { substrate, week, path: relPath, bytes:
statSync(absolutePath).size } into screenshots), and apply the same change to
the second occurrence that serializes paths so the manifest contains relative
paths while stat still uses the absolute resolved path.
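The relative-path idea can be sketched with `relative` from `node:path`. This is a minimal illustration, not the script's actual code: `toManifestPath`, the `manifestRoot` parameter, and the sample file name `week-01.png` are assumptions introduced here.

```typescript
import { relative, resolve, sep } from "node:path";

// Converts an absolute file path into a manifest-safe relative path.
// `manifestRoot` is an assumed parameter: the directory the manifest lives in.
// Separators are normalized to "/" so the manifest reads the same on every OS.
function toManifestPath(manifestRoot: string, absolutePath: string): string {
  return relative(manifestRoot, absolutePath).split(sep).join("/");
}

// Hypothetical layout for demonstration only.
const root = resolve("assets", "memory-stores-screenshots");
const abs = resolve(root, "lp", "week-01.png");

console.log(toManifestPath(root, abs)); // lp/week-01.png
```

The absolute path stays available for `statSync`, while only the relative string is serialized.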
demo-output/landing-page/w03/analytics.json-14-16 (1)

14-16: ⚠️ Potential issue | 🟠 Major

Correct the CTA count inconsistency in this week’s analytics.

Line 15 is 331, while persona CTA clicks on Lines 22, 29, and 36 total 332. Line 94 repeats 331, so aggregate/event and persona data disagree.

Also applies to: 17-38, 91-95

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@demo-output/landing-page/w03/analytics.json` around lines 14 - 16, The
aggregate CTA count under "cta_clicks.discovery_call" is inconsistent with the
sum of persona CTA clicks (aggregate shows 331 while persona entries total 332);
update the JSON so "cta_clicks.discovery_call" (and any repeated aggregate
occurrences) match the persona-level total (or adjust the persona entries if the
correct total is 331), ensuring all instances of the aggregate key
"cta_clicks.discovery_call" and the repeated aggregate value near the end of the
file are made consistent with the persona CTA counts.
demo-output/landing-page/w04/analytics.json-14-16 (1)

14-16: ⚠️ Potential issue | 🟠 Major

Reconcile CTA totals across persona and aggregate fields.

Line 15 reports 344, but persona CTA clicks on Lines 22, 29, and 36 add up to 343. Line 94 mirrors the aggregate 344, so this fixture currently has conflicting counts.

Also applies to: 17-38, 91-95

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@demo-output/landing-page/w04/analytics.json` around lines 14 - 16, The
aggregate CTA count "cta_clicks.discovery_call" (344) conflicts with the sum of
persona CTA clicks (which totals 343); update the fixture so the aggregate
equals the sum of the persona fields: compute the sum of the persona keys under
CTA (lines 17-38) and set "cta_clicks.discovery_call" and its mirrored aggregate
(lines 91-95) to that computed total (or, if the aggregate is correct, adjust
the persona entry values to sum to 344) so all three places are consistent.
demo-output/landing-page/w01/analytics.json-14-16 (1)

14-16: ⚠️ Potential issue | 🟠 Major

Fix CTA totals mismatch between aggregate and persona/event metrics.

Line 15 reports 314, but persona CTA clicks on Lines 22, 29, and 36 sum to 315. Line 94 also repeats 314. This fixture is internally inconsistent and should be regenerated from a single source of truth.

Also applies to: 17-38, 91-95

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@demo-output/landing-page/w01/analytics.json` around lines 14 - 16, The CTA
totals in the fixture are inconsistent: the aggregate key
"cta_clicks.discovery_call" (value 314 at the top and repeated later) does not
equal the sum of persona CTA clicks (which total 315). Regenerate or recalc the
fixture from the single source of truth and update every instance of
"cta_clicks.discovery_call" (both the aggregate entry and the repeated value
near the end) so the aggregate equals the sum of the persona/event breakdowns;
ensure any other related blocks in the same JSON (the ranges mentioned around
the persona/event sections) are updated to the same authoritative value.
history/lp-demo/w01/analytics.json-7-7 (1)

7-7: ⚠️ Potential issue | 🟠 Major

Persona session totals do not match top-level sessions.

Line 7 reports sessions: 5034, but persona sessions sum to 5035 (1863 + 1712 + 1460). Please reconcile this before merge to keep weekly artifacts internally consistent.

Also applies to: 20-21, 27-28, 34-35

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@history/lp-demo/w01/analytics.json` at line 7, The top-level "sessions" value
(currently 5034) is inconsistent with the persona totals (1863 + 1712 + 1460 =
5035); update the document so the "sessions" key matches the sum of persona
session counts (or adjust the persona counts if those are wrong) and apply the
same reconciliation to the other weeks called out (weeks containing the similar
mismatches at the other instances), ensuring the top-level "sessions" value
equals the sum of the persona entries.
demo-output/landing-page/w02/heatmap.json-105-117 (1)

105-117: ⚠️ Potential issue | 🟠 Major

Section identifier drift (contact vs cta) risks broken joins.

These regions are labeled as contact, but their metric reasons clearly reference cta stats. Align IDs/labels with the analytics section key to prevent mapping mismatches.

Proposed consistency fix
-          "id": "contact",
-          "label": "contact",
+          "id": "cta",
+          "label": "cta",
...
-          "id": "contact",
-          "label": "contact",
+          "id": "cta",
+          "label": "cta",
...
-          "id": "contact",
-          "label": "contact",
+          "id": "cta",
+          "label": "cta",

Also applies to: 205-216, 304-315

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@demo-output/landing-page/w02/heatmap.json` around lines 105 - 117, Some
heatmap sections have mismatched identifiers: the JSON objects with "id":
"contact" and "label": "contact" include "reason" fields that reference "cta"
metrics (e.g., reason: "cta: views=..."), which will break analytics joins;
update those objects (the ones containing "id": "contact" / "label": "contact")
to consistently use the analytics key "cta" (change "id" and "label" to "cta")
OR change the "reason" metric prefix to "contact" so the id/label and reason
match, and apply the same fix to the other similar objects noted in the diff.
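Drift of this kind could be caught mechanically by comparing each region's `id` with the section named at the start of its `reason`. This is a rough sketch, assuming every `reason` begins with `<section>:` as in the quoted example; the `HeatmapRegion` type, `findSectionDrift`, and the sample values are hypothetical.

```typescript
// Hypothetical region shape; only `id`, `label`, and `reason` come from the finding.
interface HeatmapRegion {
  id: string;
  label: string;
  reason: string; // assumed to look like "cta: views=..."
}

// Returns regions whose `reason` prefix (text before the first colon)
// names a different section than `id`.
function findSectionDrift(regions: HeatmapRegion[]): HeatmapRegion[] {
  return regions.filter((r) => {
    const prefix = (r.reason.split(":", 1)[0] ?? "").trim();
    return prefix !== r.id;
  });
}

// Made-up sample data for illustration; not copied from heatmap.json.
const sample: HeatmapRegion[] = [
  { id: "hero", label: "hero", reason: "hero: views=1" },
  { id: "contact", label: "contact", reason: "cta: views=1" },
];

console.log(findSectionDrift(sample).map((r) => r.id)); // [ 'contact' ]
```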
demo-output/landing-page/w05/analytics.json-15-15 (1)

15-15: ⚠️ Potential issue | 🟠 Major

CTA aggregate conflicts with persona click totals.

Top-level discovery_call clicks are 332 (Line 15), while persona clicks total 334 (129 + 113 + 92). This should be reconciled to avoid inconsistent analytics rollups.

Also applies to: 22-23, 29-30, 36-37

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@demo-output/landing-page/w05/analytics.json` at line 15, The top-level
"discovery_call" value (332) does not match the sum of persona-level clicks (129
+ 113 + 92 = 334); update the analytics generation so the top-level
"discovery_call" is computed from the persona entries (or vice versa) to keep
rollups consistent: locate the "discovery_call" top-level key and the persona
click objects in this JSON, recompute the aggregate as the sum of the persona
click counts (or deduplicate overlapping counts if clicks are double-counted)
and replace the mismatched 332 with the correct aggregated value, and ensure the
same fix is applied for the other mismatched pairs referenced (lines with the
same pattern).
demo-landing-page/ugly/index.html-53-60 (1)

53-60: ⚠️ Potential issue | 🟠 Major

Remove or source the hard-coded business statistics.

The numeric claims in this block are presented as factual metrics without attribution. If they are synthetic placeholders, mark them clearly as synthetic or replace with non-quantified copy.

Based on learnings: Do not fabricate analytics numbers or business stats.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@demo-landing-page/ugly/index.html` around lines 53 - 60, The three hard-coded
stats in the article block (the <strong> elements showing "3×", "67%", and
"$240K") must be removed or clearly sourced: either replace those numeric values
with non-quantified copy (e.g., "improved retention", "patients leave when care
feels inconsistent", "lost revenue per practitioner") or add a citation/footnote
and label them as synthetic/placeholders; update the three corresponding
<article> elements containing those <strong> nodes so they no longer present
unattributed factual metrics without sourcing.
agents/simulation/webster-lp-sim-redesigner.json-5-5 (1)

5-5: ⚠️ Potential issue | 🟠 Major

Add a dedicated # Scope block to the redesigner system prompt.

Line 5 defines reads/tasks/outputs, but it does not explicitly declare exact ownership boundaries versus other agents in a Scope section.

As per coding guidelines: "Include a scope section in system prompts that EXACTLY states what this agent is responsible for, with no overlap with other agents."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/simulation/webster-lp-sim-redesigner.json` at line 5, The system
prompt in the "system" string for Webster's lp simulation redesigner is missing
an explicit "# Scope" block that defines exact ownership boundaries; add a
dedicated "# Scope" section near the top of that "system" string (inside
agents/simulation/webster-lp-sim-redesigner.json) that enumerates this agent's
sole responsibilities (e.g., reads: specific files listed, task: produce
proposal.md and decision.json, outputs: push/create files via GitHub MCP) and
explicitly states what other agents or systems must NOT do (no git, no external
fetches, no overlapping judging duties with critics/monitor), ensuring wording
matches the existing instructions and avoids overlap with other agents.
agents/simulation/webster-lp-sim-planner.json-5-5 (1)

5-5: ⚠️ Potential issue | 🟠 Major

Add an explicit # Scope section to the planner system prompt.

Line 5 has Bootstrap/Task/Output, but no dedicated Scope block that defines planner-only ownership and explicit out-of-scope boundaries.

As per coding guidelines: "Include a scope section in system prompts that EXACTLY states what this agent is responsible for, with no overlap with other agents."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/simulation/webster-lp-sim-planner.json` at line 5, Add a dedicated "#
Scope" section to the "system" value in webster-lp-sim-planner.json that
explicitly enumerates this agent's responsibilities (e.g., choosing weekly
experiment direction, reading specified files via GitHub MCP, producing the
plan.md JSON with fields classification/next_action/direction_hint/optional
new_critic_request/rationale) and lists clear out-of-scope boundaries (e.g., no
git operations, no file IO outside GitHub MCP, no publishing changes, no critic
decision-making, and no external network fetches); place it near the top of the
prompt (alongside Bootstrap/Task/Output) so it is plainly visible and refers to
the same planner role and the required reads (demo-landing-page/context/* and
context/sim/lp/planner/notes.md) to avoid overlap with other agents.
scripts/onboarding/verify-env.ts-57-100 (1)

57-100: ⚠️ Potential issue | 🟠 Major

Add request timeouts to provider verification calls.

The fetch calls on lines 57, 77, and 98 lack timeouts, allowing the onboarding process to hang indefinitely if network requests stall.

Proposed fix
+const VERIFY_TIMEOUT_MS = 10_000;
+
+function fetchWithTimeout(url: string, init: RequestInit): Promise<Response> {
+  return fetch(url, { ...init, signal: AbortSignal.timeout(VERIFY_TIMEOUT_MS) });
+}
+
 async function verifyAnthropic(key: string): Promise<VerifyResult> {
@@
-    const res = await fetch("https://api.anthropic.com/v1/models", {
+    const res = await fetchWithTimeout("https://api.anthropic.com/v1/models", {
       headers: {
         "x-api-key": key,
         "anthropic-version": "2023-06-01",
       },
     });
@@
 async function verifyGitHub(token: string): Promise<VerifyResult> {
@@
-    const res = await fetch("https://api.github.com/user", {
+    const res = await fetchWithTimeout("https://api.github.com/user", {
       headers: {
         Authorization: `Bearer ${token}`,
         Accept: "application/vnd.github+json",
         "X-GitHub-Api-Version": "2022-11-28",
       },
     });
@@
 async function verifyCloudflare(token: string): Promise<VerifyResult> {
@@
-    const res = await fetch("https://api.cloudflare.com/client/v4/user/tokens/verify", {
+    const res = await fetchWithTimeout("https://api.cloudflare.com/client/v4/user/tokens/verify", {
       headers: { Authorization: `Bearer ${token}` },
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/onboarding/verify-env.ts` around lines 57 - 100, Add a per-request
timeout using AbortController for the three provider verification fetches (the
Anthropic fetch block in verifyAnthropic, verifyGitHub, and verifyCloudflare):
create an AbortController, start a setTimeout to call controller.abort() after a
reasonable timeout (e.g., 5000ms), pass controller.signal to fetch, and ensure
you clear the timeout (clearTimeout) in a finally block so it doesn’t leak;
preserve existing error handling but ensure aborted requests surface as fetch
errors so the existing catch returns a failed VerifyResult.
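The AbortController variant described in the prompt might look like the sketch below. It is an illustration under assumptions rather than the script's code: the `FetchLike` type and the injectable `fetchImpl` parameter are additions made here so the helper can be exercised without network access, and the 5000 ms default simply mirrors the example value in the prompt.

```typescript
// Minimal fetch signature used by this sketch (an assumption for testability).
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

// Wraps a fetch-like call with an AbortController-based timeout.
async function fetchWithTimeout(
  url: string,
  init: RequestInit = {},
  timeoutMs = 5000,
  fetchImpl: FetchLike = (u, i) => fetch(u, i),
): Promise<Response> {
  const controller = new AbortController();
  // Abort the request if it has not settled within timeoutMs.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...init, signal: controller.signal });
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

An aborted request rejects, so an existing `catch` around the call would see it like any other fetch failure. Where the runtime supports it, `AbortSignal.timeout(ms)` (used in the proposed fix) is the shorter equivalent.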
demo-landing-page/ugly/style.css-86-87 (1)

86-87: ⚠️ Potential issue | 🟠 Major

Avoid hotlinking hero media from an external URL.

Loading the hero image from Unsplash makes rendering dependent on third-party availability and network policy, which can cause flaky captures and nondeterministic output.

Proposed fix
-    url("https://images.unsplash.com/photo-1624616802182-57737fa83971?w=1920&h=1080&fit=crop&q=80&auto=format")
-      center/cover;
+    url("./assets/hero-background.webp") center/cover;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@demo-landing-page/ugly/style.css` around lines 86 - 87, The CSS is hotlinking
an external Unsplash URL (url("https://images.unsplash.com/...")) for the hero
background which causes flaky renders; download and check the image into the
repo (e.g., demo-landing-page/assets/ or static/), update the style.css rule
that currently uses url("https://images.unsplash.com/...") to reference a
relative local path (e.g., url("/assets/hero.jpg")) and ensure your build/static
assets pipeline serves that file so the hero background uses the bundled asset
instead of the external URL.
agents/simulation/webster-lp-sim-visual-reviewer.json-5-5 (1)

5-5: ⚠️ Potential issue | 🟠 Major

Add an explicit # Scope section to the system prompt.

This prompt has bootstrap/task/output, but it does not include a dedicated scope block that explicitly bounds responsibilities and overlap, which is required for agent-spec compliance.

As per coding guidelines: "Include a scope section in system prompts that EXACTLY states what this agent is responsible for, with no overlap with other agents."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/simulation/webster-lp-sim-visual-reviewer.json` at line 5, The system
prompt string in webster-lp-sim-visual-reviewer.json is missing a dedicated "#
Scope" section; update the "system" value (the long prompt text) to add a clear
"# Scope" block that explicitly and concisely states only this agent’s
responsibilities (visual review of LP screenshots, required GitHub MCP reads,
judging against brand/personas, and writing the visual-review.md) and explicitly
excludes any other agents’ duties (e.g., editing content, running git, or making
network requests); place the block near the top of the existing prompt so
functions like the bootstrap/task/output sections remain, and ensure the
language matches the agent-spec requirement of EXACTLY stating responsibilities
with no overlap.
scripts/onboarding/scaffold-repo.ts-246-258 (1)

246-258: ⚠️ Potential issue | 🟠 Major

The fixed temp worktree breaks reruns after a partial scaffold.

tmp/onboarding-scaffold/<repoName> is reused across runs, so any leftover .git metadata from a prior attempt makes git remote add origin and later commands fail. That means the advertised idempotent path is not actually repeatable for the same repo name.

Also applies to: 307-312

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/onboarding/scaffold-repo.ts` around lines 246 - 258, The
commitAndPush function fails on reruns because leftover .git metadata in the
reused tmp/onboarding-scaffold/<repoName> directory prevents git remote
add/push; before running git commands, detect and remove any existing .git
directory (e.g. using fs.existsSync(path.join(workDir, ".git")) and
fs.rmSync(..., { recursive: true, force: true })) so git init can run cleanly,
then proceed with run(["git", "init", ...]) etc.; apply the same
pre-check-and-remove logic to the other scaffold git block referenced around the
later occurrence (the block at 307-312) so repeated runs are idempotent.
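The pre-check-and-remove step could be sketched as follows. `resetGitMetadata` is a hypothetical helper name, and the demonstration runs against a throwaway temp directory rather than the real `tmp/onboarding-scaffold/<repoName>` path.

```typescript
import { existsSync, mkdirSync, mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Removes stale git metadata so a following `git init` starts from a clean state.
// Returns true when a leftover .git directory was found and removed.
function resetGitMetadata(workDir: string): boolean {
  const gitDir = join(workDir, ".git");
  if (!existsSync(gitDir)) return false;
  rmSync(gitDir, { recursive: true, force: true });
  return true;
}

// Demonstration against a throwaway directory (not the real scaffold path).
const demoDir = mkdtempSync(join(tmpdir(), "scaffold-demo-"));
mkdirSync(join(demoDir, ".git"));
console.log(resetGitMetadata(demoDir)); // true
console.log(resetGitMetadata(demoDir)); // false
rmSync(demoDir, { recursive: true, force: true });
```

Calling the helper before `git init` keeps a second run for the same repo name from tripping over a remote that was already added.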
scripts/register-sim-agents.ts-38-42 (1)

38-42: ⚠️ Potential issue | 🟠 Major

Validate the full sim-spec set before the first API write.

Right now filename filtering is the only preflight. If one JSON has a bad name or other malformed shape, this loop can register some agents successfully and only fail afterward while deriving the manifest, leaving a partial remote state behind. Parse/validate the entire set up front, then start the POSTs.

Based on learnings: Before committing a new agent spec: validate against schema, ensure registration is idempotent, and check name collision across all specs in both agents sets.

Also applies to: 89-98

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/register-sim-agents.ts` around lines 38 - 42, The loadSimAgentSpecs
function currently streams JSON files into POSTs without full preflight
validation; update loadSimAgentSpecs to read and parse all candidate files
first, validate each parsed object against the AgentSpec schema (ensure required
fields like name, id, etc.), accumulate and fail fast on any validation errors,
detect duplicate names within the parsed set, then perform idempotent
registration POSTs (use a safe upsert or check-exists before create) and check
for name collisions against the existing remote agents before any network
writes; also apply the same preflight validation logic to the registration loop
referenced around the register/POST logic (the code block around lines ~89-98)
so the manifest derivation and remote writes only run after the complete
validated set is confirmed.
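A parse-everything-first preflight could be sketched as below. This is a simplified illustration: `AgentSpecLike` checks only `name`, whereas the real AgentSpec schema is richer, and `preflight` plus the filename-to-text input map are hypothetical. The remote collision check and the POSTs themselves are out of scope here.

```typescript
// Hypothetical minimal spec shape; the real AgentSpec schema has more fields.
interface AgentSpecLike {
  name: string;
}

interface PreflightResult {
  specs: AgentSpecLike[];
  errors: string[];
}

// Parses and validates every candidate before any network write.
// `files` maps filename -> raw JSON text.
function preflight(files: Record<string, string>): PreflightResult {
  const specs: AgentSpecLike[] = [];
  const errors: string[] = [];
  const seen = new Map<string, string>(); // name -> first filename that used it

  for (const [file, raw] of Object.entries(files)) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      errors.push(`${file}: invalid JSON`);
      continue;
    }
    const name = (parsed as { name?: unknown } | null)?.name;
    if (typeof name !== "string" || name.length === 0) {
      errors.push(`${file}: missing or non-string "name"`);
      continue;
    }
    const first = seen.get(name);
    if (first !== undefined) {
      errors.push(`${file}: duplicate name "${name}" (also in ${first})`);
      continue;
    }
    seen.set(name, file);
    specs.push({ name });
  }
  return { specs, errors };
}
```

The registration loop would then start only when `errors` is empty, so a single malformed file cannot leave a partial remote state behind.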
scripts/onboarding/scaffold-repo.ts-196-199 (1)

196-199: ⚠️ Potential issue | 🟠 Major

Update scaffold dependencies to Astro 6.

This generator creates projects with astro@^4.0.0 and @astrojs/cloudflare@^9.0.0, violating the locked stack (Astro 6). Update to astro@^6.0.0 and @astrojs/cloudflare@^13.0.0 to align with project standards.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/onboarding/scaffold-repo.ts` around lines 196 - 199, Update the
hardcoded dependency versions in the dependencies object: change "astro" from
"^4.0.0" to "^6.0.0" and change "@astrojs/cloudflare" from "^9.0.0" to "^13.0.0"
so the scaffolded projects use Astro 6; locate the dependencies object (the
dependencies symbol in scaffold-repo.ts) and replace the version strings for the
astro and `@astrojs/cloudflare` entries accordingly.
context/ONBOARDING-CASE-STUDY.md-25-27 (1)

25-27: ⚠️ Potential issue | 🟠 Major

Update this spec to the current production agent inventory.

This doc still says onboarding provisions/verifies 10 production agents via agents/*.json, but the repo contract in this PR is 9 protected production specs under agents/production/. If someone follows this literally, the verify step and the narration both drift from the actual production set.

Based on learnings: The 9 specs in agents/production/ are the production set for Nicolette's real weekly council — do not modify them.

Also applies to: 54-55, 130-137, 217-223

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@context/ONBOARDING-CASE-STUDY.md` around lines 25 - 27, The spec text
incorrectly claims onboarding provisions/verifies 10 production agents via
agents/*.json; update every referenced count to reflect the actual 9 protected
production specs present under agents/production/ (e.g., change "10 production
agents" → "9 production agents" in the Q3–Q5 table and the other occurrences you
noted around lines 54–55, 130–137, and 217–223), keep the repository's 9 JSON
specs untouched (do not modify agents/production/*), and ensure any narration or
verification steps (mentions of agents/*.json, "production set", or verify step)
consistently reference the 9-agent inventory.
demo-output/landing-page/w04/visual-review.md-6-13 (1)

6-13: ⚠️ Potential issue | 🟠 Major

Keep this bundle on the week-NN contract.

The new demo-manifest pipeline only scans directories named week-\d{2}. A demo-output/landing-page/w04/... artifact will be invisible to build-demo-manifest.ts, so this review bundle will be skipped from the manifest/final-sheet flow.

Based on learnings: Structure new hackathon simulation output as demo-output/landing-page/week-NN/... following the asset-bundle contract defined in context/EXPANSION-TASKS.md T7–T9.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@demo-output/landing-page/w04/visual-review.md` around lines 6 - 13, The
bundle is placed at demo-output/landing-page/w04 which doesn't match the
pipeline's week-\d{2} scan; move or rename the artifact directory to follow the
asset-bundle contract so it becomes demo-output/landing-page/week-04/... (or
create a week-04 symlink) and ensure any generated references (manifest entries)
point to the new path; if you prefer changing code instead, update the scanner
in build-demo-manifest.ts to accept the current pattern, but the preferred fix
is to restructure output to demo-output/landing-page/week-NN per
context/EXPANSION-TASKS.md T7–T9 so the bundle is discovered by
build-demo-manifest.ts.
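The naming mismatch can be made concrete with a small sketch. The regular expression is taken from the finding's description of the scanner (`week-\d{2}`); `toWeekDirName` is a hypothetical helper for mapping the short `wNN` form onto the contract name, not a function from the repo.

```typescript
// Assumed scan pattern, taken from the finding's description (week-\d{2}).
const WEEK_DIR = /^week-\d{2}$/;

// Maps a short "wNN" directory name to the "week-NN" form, or returns null
// when the name matches neither convention.
function toWeekDirName(name: string): string | null {
  if (WEEK_DIR.test(name)) return name;
  const short = /^w(\d{2})$/.exec(name);
  return short ? `week-${short[1]}` : null;
}

console.log(WEEK_DIR.test("w04")); // false: skipped by the scanner
console.log(toWeekDirName("w04")); // week-04
```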
scripts/build-demo-manifest.ts-240-246 (1)

240-246: ⚠️ Potential issue | 🟠 Major

Validate final_sheet with the other top-level absolute paths.

DEMO_MANIFEST_SCHEMA requires final_sheet to be absolute, but validateDemoManifest() never checks it. A caller can pass a relative final_sheet, get a clean validation result, and only fail later when a consumer treats the manifest as schema-valid.

Proposed fix
   if (
     !manifest.substrate ||
     !isAbsolute(manifest.output_dir) ||
-    !isAbsolute(manifest.manifest_path)
+    !isAbsolute(manifest.manifest_path) ||
+    !isAbsolute(manifest.final_sheet)
   ) {
     throw new Error("manifest paths must be absolute and substrate must be present");
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/build-demo-manifest.ts` around lines 240 - 246, The validation block
that currently checks manifest.substrate, manifest.output_dir, and
manifest.manifest_path must also validate manifest.final_sheet is present and
absolute; update the condition (in validateDemoManifest / the manifest
validation block that throws "manifest paths must be absolute and substrate must
be present") to include !isAbsolute(manifest.final_sheet) (and check presence if
needed) so DEMO_MANIFEST_SCHEMA's requirement for an absolute final_sheet is
enforced at validation time.
scripts/__tests__/register-sim-agents.test.ts-34-35 (1)

34-35: ⚠️ Potential issue | 🟠 Major

Use addFormats(ajv) instead of addFormats.default(ajv).

The statement `import addFormats from "ajv-formats"` already binds the default export. Calling .default on it bypasses the intended API and relies on module-interop quirks, which can cause runtime errors.

Proposed fix
-    addFormats.default(ajv);
+    addFormats(ajv);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/__tests__/register-sim-agents.test.ts` around lines 34 - 35, The test
creates an Ajv2020 instance as ajv and currently calls addFormats.default(ajv),
which relies on interop quirks; change the call to use the default export
directly by invoking addFormats(ajv) instead. Locate the Ajv2020 instantiation
(const ajv = new Ajv2020(...)) and replace the addFormats.default usage with
addFormats(ajv) so the addFormats import is used via its intended API.
scripts/__tests__/register-sim-agents.test.ts-76-151 (1)

76-151: ⚠️ Potential issue | 🟠 Major

Add assertions for the critical API contract: beta header and pagination.

The tests prove create-vs-reuse behavior but have two gaps that would allow regressions:

  1. Header validation missing: No test asserts that anthropic-beta: managed-agents-2026-04-01 is sent with requests; removing this required header would not fail the tests.

  2. Pagination coverage missing: Both test mocks return has_more: false, so if findAgentByName() stops handling next_page or last_id cursors, the tests would still pass. Add a test case that mocks has_more: true with next_page to exercise the loop in findAgentByName().

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/__tests__/register-sim-agents.test.ts` around lines 76 - 151, The
tests in register-sim-agents.test.ts miss asserting the required API header and
pagination behavior; update the existing fetch mocks (used when testing
registerSimAgents and findAgentByName) to assert that requests include the
header "anthropic-beta: managed-agents-2026-04-01" (verify init.headers or
RequestInit passed into globalThis.fetch) and add a new test (or extend an
existing one) that simulates paginated responses by returning { data: [...],
has_more: true, next_page: "cursor1" } on the first GET and a final page with
has_more: false on the second GET so findAgentByName's loop is exercised; ensure
the POST path still checks for header when creating agents and keep using
loadSimAgentSpecs(), registerSimAgents, and the same globalThis.fetch override
pattern to locate the logic.
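
A minimal sketch of the paginated mock described above, written against a simplified stand-in rather than the real module. The `findAgentByName` below, the `ListAgents` callback, and `makePagedStub` are all hypothetical; the actual function in the repository may take different arguments and call globalThis.fetch directly. Only the header value and the has_more / next_page fields come from the review.

```typescript
interface Agent { id: string; name: string }
interface AgentPage { data: Agent[]; has_more: boolean; next_page?: string }
interface RecordedCall { cursor: string | undefined; headers: Record<string, string> }
type ListAgents = (cursor: string | undefined, headers: Record<string, string>) => Promise<AgentPage>;

// Header named in the review; every list call is expected to carry it.
const BETA_HEADER: Record<string, string> = { "anthropic-beta": "managed-agents-2026-04-01" };

// Simplified stand-in: follow next_page cursors until the agent is found or pages run out.
async function findAgentByName(name: string, listAgents: ListAgents): Promise<Agent | undefined> {
  let cursor: string | undefined;
  for (;;) {
    const page = await listAgents(cursor, BETA_HEADER);
    const match = page.data.find((agent) => agent.name === name);
    if (match) return match;
    if (!page.has_more || page.next_page === undefined) return undefined;
    cursor = page.next_page;
  }
}

// Test double: serves the given pages in order and records cursor and headers per call.
function makePagedStub(pages: AgentPage[]): { listAgents: ListAgents; calls: RecordedCall[] } {
  const calls: RecordedCall[] = [];
  const listAgents: ListAgents = async (cursor, headers) => {
    calls.push({ cursor, headers });
    const page = pages[calls.length - 1];
    if (!page) throw new Error("stub ran out of pages");
    return page;
  };
  return { listAgents, calls };
}
```

A test built on this stub would place the target agent on the second page, then assert that two calls were made, that the second call carried the first page's cursor, and that every call included the beta header.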
scripts/run-simulation.ts-181-198 (1)

181-198: ⚠️ Potential issue | 🟠 Major

Fail the run when a memory-summary write is rejected.

These POSTs ignore res.ok. A 401/404/5xx leaves the simulation green while no summary document is persisted, which breaks the state this step is supposed to carry across weeks.

Proposed fix
-    await fetch(`${API}/memory_stores/${storeId}/documents`, {
+    const res = await fetch(`${API}/memory_stores/${storeId}/documents`, {
       method: "POST",
       headers: {
         "x-api-key": apiKey,
         "anthropic-version": VERSION,
         "anthropic-beta": BETA,
@@
         },
       }),
     });
+    if (!res.ok) {
+      throw new Error(
+        `memory summary write failed for ${role} (${res.status}): ${await res.text()}`,
+      );
+    }
   }
 }
Based on learnings: Do not silently catch errors to make things look green; surface `[STUCK]` prefix if a path is not clear.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/run-simulation.ts` around lines 181 - 198, The POST to
`${API}/memory_stores/${storeId}/documents` currently ignores the HTTP response;
change the call in scripts/run-simulation.ts so you await the fetch result into
a variable (the fetch that posts summaries[role]) and check res.ok — if not ok,
read the response text/JSON and throw an Error that includes the status and body
so the run fails; ensure the thrown error or logged message includes the
“[STUCK]” prefix when the write path is unclear (so the simulation shows as
red/failed rather than silently passing).
scripts/run-simulation.ts-91-96 (1)

91-96: ⚠️ Potential issue | 🟠 Major

Do not forward ANTHROPIC_API_KEY into prompts/sim-council.md.

The spawned env inherits the parent ANTHROPIC_API_KEY, but the prompt aborts when that variable is exported. In the default path, setting the key to enable memory-summary writes therefore breaks the council step immediately.

Proposed fix
 function defaultRunCouncil(
   env: Record<string, string>,
   command = "bun scripts/run-markdown-bash.ts prompts/sim-council.md",
 ): void {
-  execFileSync("bash", ["-lc", command], { env: { ...process.env, ...env }, stdio: "inherit" });
+  const childEnv = { ...process.env, ...env };
+  delete childEnv.ANTHROPIC_API_KEY;
+  execFileSync("bash", ["-lc", command], { env: childEnv, stdio: "inherit" });
 }
Based on learnings: Applies to `prompts/sim-council.md` : Use `prompts/sim-council.md` as the simulation orchestrator (a fork of the production orchestrator) for hackathon expansion.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/run-simulation.ts` around lines 91 - 96, defaultRunCouncil currently
forwards the parent ANTHROPIC_API_KEY into the child process, which causes
prompts/sim-council.md to abort; fix by creating a child environment object
(merge process.env and the passed env) and explicitly remove/undefine the
ANTHROPIC_API_KEY key before calling execFileSync; update the defaultRunCouncil
implementation (referencing function defaultRunCouncil, execFileSync, command,
and prompts/sim-council.md) to pass that sanitized env to the child process.
scripts/context-schema.ts-153-161 (1)

153-161: ⚠️ Potential issue | 🟠 Major

Convert file-read / JSON-parse failures into validation errors.

validateContextDirectory() throws on ENOENT or malformed personas.json / brand.json, so the CLI exits before printing the per-directory report. Return those as collected errors instead of crashing the whole validation pass.

Proposed fix
 export function validateContextDirectory(contextDir: string): string[] {
-  const business = readFileSync(`${contextDir}/business.md`, "utf8");
-  const personas = JSON.parse(readFileSync(`${contextDir}/personas.json`, "utf8"));
-  const brand = JSON.parse(readFileSync(`${contextDir}/brand.json`, "utf8"));
-  return [
-    ...validateBusinessMarkdown(business),
-    ...validatePersonas(personas),
-    ...validateBrandContext(brand),
-  ];
+  const errors: string[] = [];
+  let business: string | undefined;
+  let personas: unknown;
+  let brand: unknown;
+
+  try {
+    business = readFileSync(`${contextDir}/business.md`, "utf8");
+  } catch (error) {
+    errors.push(`business.md could not be read: ${(error as Error).message}`);
+  }
+  try {
+    personas = JSON.parse(readFileSync(`${contextDir}/personas.json`, "utf8"));
+  } catch (error) {
+    errors.push(`personas.json is missing or invalid JSON: ${(error as Error).message}`);
+  }
+  try {
+    brand = JSON.parse(readFileSync(`${contextDir}/brand.json`, "utf8"));
+  } catch (error) {
+    errors.push(`brand.json is missing or invalid JSON: ${(error as Error).message}`);
+  }
+
+  return [
+    ...errors,
+    ...(business ? validateBusinessMarkdown(business) : []),
+    ...(personas === undefined ? [] : validatePersonas(personas)),
+    ...(brand === undefined ? [] : validateBrandContext(brand)),
+  ];
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/context-schema.ts` around lines 153 - 161, The function
validateContextDirectory currently throws on missing or malformed files; instead
wrap the reads and JSON.parse calls for business.md, personas.json, and
brand.json inside try/catch blocks in validateContextDirectory so any ENOENT or
parse errors are caught and converted into returned validation error messages
(e.g., push human-readable strings into the returned array). Keep calling
validateBusinessMarkdown(business), validatePersonas(personas), and
validateBrandContext(brand) when a file successfully loads, but if a read/parse
fails for a given file, add a clear error entry to the result array describing
which file failed and the error message rather than letting the exception
propagate.
scripts/capture-mem-stores.ts-61-84 (1)

61-84: ⚠️ Potential issue | 🟠 Major

Validate console_url and output before driving a real browser profile.

This accepts any URL and any destination path. Because the script opens that URL in a logged-in default browser profile and writes to the requested file, a forged CAPTURE_TRIGGER can browse arbitrary sites and overwrite arbitrary files. Please restrict console_url to the Anthropic memory-stores page and output to the expected screenshot directory.

Proposed fix
 function parsePayload(raw: string): CaptureTriggerPayload {
   const parsed = JSON.parse(raw) as Partial<CaptureTriggerPayload>;
@@
   if (!parsed.output) {
     throw new Error("payload.output is required");
   }
   if (!parsed.console_url) {
     throw new Error("payload.console_url is required");
   }
+  const url = new URL(parsed.console_url);
+  if (
+    url.origin !== "https://console.anthropic.com" ||
+    !url.pathname.startsWith("/settings/memory-stores")
+  ) {
+    throw new Error("payload.console_url must target Anthropic Console memory stores");
+  }
+  const normalizedOutput = parsed.output.replaceAll("\\", "/");
+  if (!normalizedOutput.startsWith("assets/memory-stores-screenshots/")) {
+    throw new Error("payload.output must stay under assets/memory-stores-screenshots/");
+  }
   return {
     event: parsed.event,
     substrate: parsed.substrate,
     week: parsed.week,
-    output: parsed.output,
-    console_url: parsed.console_url,
+    output: normalizedOutput,
+    console_url: parsed.console_url,
   };
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/capture-mem-stores.ts` around lines 61 - 84, The parsePayload
function currently accepts any console_url and output; validate that
parsed.console_url exactly matches (or matches a strict regexp for) the
Anthropic memory-stores page URL used by the app (e.g., host and path for the
memory-stores console) and throw an error if it does not, and validate
parsed.output to ensure it is inside the allowed screenshots directory (no
absolute paths, no .. segments, and only allow safe filenames/extensions such as
alphanumeric + dashes/underscores with .png/.jpg); update parsePayload to
perform these checks and return only when both validations pass, otherwise throw
descriptive errors.
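
The prompt above asks for stricter output checks than the prefix test in the proposed fix (no absolute paths, no .. segments, safe filenames only). A minimal sketch of that stricter check follows; `validateOutputPath` and the filename pattern are assumptions for illustration, and only the assets/memory-stores-screenshots/ prefix is taken from the proposed fix.

```typescript
// Prefix taken from the proposed fix; everything else here is a hypothetical sketch.
const ALLOWED_PREFIX = "assets/memory-stores-screenshots/";
// Assumed filename policy: letters, digits, dashes, underscores, then .png or .jpg.
const SAFE_NAME = /^[A-Za-z0-9_-]+\.(png|jpg)$/;

function validateOutputPath(output: string): string {
  // Normalize Windows separators so the remaining checks see a single form.
  const normalized = output.replace(/\\/g, "/");
  if (normalized.startsWith("/") || /^[A-Za-z]:\//.test(normalized)) {
    throw new Error("payload.output must be a relative path");
  }
  const segments = normalized.split("/");
  if (segments.some((segment) => segment === "" || segment === "." || segment === "..")) {
    throw new Error("payload.output must not contain empty, '.' or '..' segments");
  }
  if (!normalized.startsWith(ALLOWED_PREFIX)) {
    throw new Error("payload.output must stay under " + ALLOWED_PREFIX);
  }
  const fileName = segments[segments.length - 1];
  if (!SAFE_NAME.test(fileName)) {
    throw new Error("payload.output must end in a safe .png or .jpg filename");
  }
  return normalized;
}
```

Rejecting .. segments before the prefix test matters: a value such as assets/memory-stores-screenshots/../../secrets.png passes a plain startsWith check but resolves outside the screenshots directory.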
🧹 Nitpick comments (4)
scripts/__tests__/validate-agents.test.ts (1)

39-42: Make seo-critic fixture selection deterministic.

Basename lookup can become ambiguous if another seo-critic.json is added later (e.g., archive/mirror). Prefer targeting the canonical production path directly.

Proposed change
-  const seoCriticPath = agentFiles.find((f) => f.endsWith("seo-critic.json"));
-  if (!seoCriticPath) {
-    throw new Error("seo-critic.json not found under agents/");
-  }
+  const seoCriticPath = join(agentsDir, "production", "seo-critic.json");
+  if (!agentFiles.includes(seoCriticPath)) {
+    throw new Error("agents/production/seo-critic.json not found under agents/");
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/__tests__/validate-agents.test.ts` around lines 39 - 42, The current
selection uses agentFiles.find((f) => f.endsWith("seo-critic.json")) which is
ambiguous if multiple files share that basename; change the lookup to target the
canonical production path explicitly (for example, the full path
agents/production/seo-critic.json) so seoCriticPath is found
deterministically instead of by basename.
demo-landing-page/ugly/style.css (1)

18-20: Respect reduced-motion preferences for smooth scrolling.

Add a reduced-motion override so users who disable motion aren’t forced into animated scroll behavior.

Proposed fix
 html {
   scroll-behavior: smooth;
 }
+
+@media (prefers-reduced-motion: reduce) {
+  html {
+    scroll-behavior: auto;
+  }
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@demo-landing-page/ugly/style.css` around lines 18 - 20, The current CSS rule
forces smooth scrolling on the html element via the scroll-behavior property;
add a prefers-reduced-motion media-query override so users who opt out of motion
get non-animated scrolling (override html { scroll-behavior: smooth; } with html
{ scroll-behavior: auto; } inside a `@media` (prefers-reduced-motion: reduce)
block).
scripts/__tests__/sim-council.test.ts (1)

74-82: Avoid VCS-state assertions in unit tests.

Lines 74–82 make the suite depend on git working-tree state, not sim-council behavior. That is brittle in non-repo test environments and will fail on legitimate future edits to prompts/second-wbs-session.md.

Proposed change
-  test("does not modify production weekly orchestrator", () => {
-    const diff = Bun.spawnSync(
-      ["git", "diff", "--name-only", "--", "prompts/second-wbs-session.md"],
-      {
-        stdout: "pipe",
-      },
-    );
-    expect(new TextDecoder().decode(diff.stdout).trim()).toBe("");
-  });

Move this guard to CI policy (PR-level check), and keep this file focused on prompts/sim-council.md behavior/invariants.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/__tests__/sim-council.test.ts` around lines 74 - 82, The test "does
not modify production weekly orchestrator" in
scripts/__tests__/sim-council.test.ts relies on VCS state via
Bun.spawnSync(["git","diff",...]) and should be removed from the unit test
suite; delete the block that shells out to git (the test starting with
test("does not modify production weekly orchestrator", ...)) and instead enforce
the guard as a CI/PR-level check (move the git-diff assertion into your CI
pipeline or a separate pre-merge script), keeping this test file focused only on
sim-council behavior and invariants (e.g., tests around prompts/sim-council.md).
demo-output/landing-page/agents.json (1)

1-1: Reduce tool privileges for local-lp-redesigner.

Bash enables arbitrary shell execution, while this role’s stated workflow is primarily file reads/writes/edits. Dropping Bash (unless there is a hard requirement) lowers prompt-injection and accidental command risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@demo-output/landing-page/agents.json` at line 1, The agent spec for
local-lp-redesigner grants an unnecessary high-risk tool ("Bash"); remove "Bash"
from the tools array in the local-lp-redesigner agent definition (the object
keyed "local-lp-redesigner") so it only lists safe operations
("Read","Write","Edit","MultiEdit") unless there's an explicit, documented need
for shell access; after editing the tools array, run any schema/validation
checks or CI that validate agents.json to ensure the file stays well-formed and
that local-lp-redesigner still meets workflow requirements.

Comment on lines +50 to +81
export async function processBridgeInput(input: string, deps: BridgeDeps = {}): Promise<void> {
  const spawnCapture = deps.spawnCapture ?? defaultSpawnCapture;
  const writeStdout = deps.writeStdout ?? ((line: string) => process.stdout.write(line));
  const lines = input.split(/(?<=\n)/).filter((line) => line.length > 0);
  for (const rawLine of lines) {
    writeStdout(rawLine);
    const line = rawLine.endsWith("\n") ? rawLine.slice(0, -1) : rawLine;
    const trigger = parseCaptureTriggerLine(line);
    if (trigger) {
      await spawnCapture(trigger);
    }
  }
}

async function readStdin(): Promise<string> {
  const chunks: Uint8Array[] = [];
  for await (const chunk of Bun.stdin.stream()) {
    chunks.push(chunk);
  }
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const merged = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    merged.set(chunk, offset);
    offset += chunk.length;
  }
  return new TextDecoder().decode(merged);
}

if (import.meta.main) {
  try {
    await processBridgeInput(await readStdin());

⚠️ Potential issue | 🔴 Critical

Process stdin incrementally instead of buffering to EOF.

This bridge doesn't forward lines or fire captures until readStdin() finishes, so a week-1 or week-5 trigger will be handled after the whole simulation exits. That defeats the point of a mid-run snapshot and can capture the final store state three times instead of the state at each trigger.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/sim-capture-bridge.ts` around lines 50 - 81, The code currently
buffers the entire stdin via readStdin() and only calls processBridgeInput after
EOF; change the import.meta.main path to consume Bun.stdin.stream()
incrementally and handle lines as they arrive: inside the main async block read
chunks from Bun.stdin.stream(), append to a string buffer, split out complete
newline-terminated lines, and for each complete line call the same logic used in
processBridgeInput (writeStdout/defaultWrite and parseCaptureTriggerLine + await
spawnCapture or defaultSpawnCapture) so triggers fire mid-run; keep any leftover
partial line in the buffer until more data arrives and avoid awaiting EOF.
Ensure you reuse parseCaptureTriggerLine, spawnCapture (or defaultSpawnCapture)
and writeStdout/defaults to preserve behavior.
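
The buffering step in the prompt above can be sketched as a small line splitter. This is an illustration under assumptions, not the bridge's code: the `LineSplitter` class is hypothetical, it takes already-decoded text, and the sample lines are placeholders rather than the real trigger format. In the bridge, each chunk from Bun.stdin.stream() would be decoded (for example with a streaming TextDecoder) and pushed here, and each returned line would go through writeStdout and parseCaptureTriggerLine.

```typescript
// Hypothetical helper: accumulates text chunks and releases complete lines as they arrive.
class LineSplitter {
  private buffer = "";

  // Append a chunk and return every complete line, each with its trailing "\n" kept,
  // matching the split(/(?<=\n)/) behavior of the buffered implementation.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const lines: string[] = [];
    let newline = this.buffer.indexOf("\n");
    while (newline !== -1) {
      lines.push(this.buffer.slice(0, newline + 1));
      this.buffer = this.buffer.slice(newline + 1);
      newline = this.buffer.indexOf("\n");
    }
    return lines;
  }

  // Call once at end of input to release a final line that had no trailing newline.
  flush(): string[] {
    const rest = this.buffer;
    this.buffer = "";
    return rest.length > 0 ? [rest] : [];
  }
}

// Chunk boundaries do not align with line boundaries; placeholder content only.
const splitter = new LineSplitter();
console.log(splitter.push("week 1 do"));          // []
console.log(splitter.push("ne\nplaceholder"));    // ["week 1 done\n"]
console.log(splitter.push(" trigger\ntail"));     // ["placeholder trigger\n"]
console.log(splitter.flush());                    // ["tail"]
```

Because push returns lines as soon as their newline arrives, a trigger line can be acted on while the simulation is still running instead of after EOF.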

* chore(final-polish): remove video footprint + polish context docs

Remove submission-tooling that does not belong in the public repo:
skills/webster-video/, video/, context/webster-video/, the video
composition prompt, the onboarding case-study, and the
webster-onboarding empire fixture. The HyperFrames render pipeline
that produced the timelapse was always tooling-side, not product.

Polish ARCHITECTURE / FEATURES / DOMAIN-MODEL / VISION / QUALITY-GATES
to the post-eeda2bc shipped state: 9 production Managed Agents
mirrored 1:1 by 9 simulation specs. Reframe Layer 6 (video) as
external submission tooling, not a blocked product layer. Move
visual-design-critic to its true provenance — a W4 genealogy spawn,
not a permanent L2 base agent. Drop deferred / hackathon-crunch
language from the canon docs; that posture is over.

Strip unverifiable projections from README (cost-per-month,
cost-per-run, agency-pricing comparisons). Standardize the
test-count phrasing to "29 test files green via bun run validate".

Tidy dangling references in skills/webster-onboarding and qa-bank.

Validate is green: 176 tests pass, 0 lint warnings, 0 type errors,
0 markdown errors, 19 JSON specs valid, 7 findings files valid.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(final-polish): drop watch-dispatches.sh dead Forge watcher

Forge dispatch watcher with no live consumer post-hackathon.
Only references were in checkpoints/compactions, no callers
in scripts/, prompts/, package.json, or workflows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(final-polish): add judge tour + per-week INDEX

Add a "5-minute judge tour" section to README naming the exact
click-path for evaluation: pitch → INDEX → one week's visual review
→ critic-genealogy.ts → optional live --dry-run.

Add demo-output/landing-page/INDEX.md narrating the 11-week LP
timelapse, one beat per week with classification + links to that
week's screenshots and visual review. Turns scattered week directories
into a scannable evidence path.

Drop the "rendered MP4 hosted externally — link in submission form"
references from README, ARCHITECTURE, and FEATURES. The video link
is owned by the submission form, not the repo, and pointing at a
form judges may not have open is dead weight. The per-week assets
plus INDEX.md carry the visual evidence on their own.

Validate green: 176 tests pass, 0 lint warnings, 0 type errors,
0 markdown errors, 19 specs valid, 7 findings files valid.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@richsak richsak merged commit c313a46 into main Apr 27, 2026
8 of 9 checks passed