feat(scengen): add crypt-admin binary with graph-first scenario generation (DES-027) by jmf-pobox · Pull Request #47 · punt-labs/cryptd

jmf-pobox · 2026-03-15T21:25:09Z

Summary

New crypt-admin binary with generate, validate, and export subcommands for graph-first scenario authoring (DES-027)
New internal/scengen/ package: Graph types with 6-direction compass, TopologySource interface, TreeSource filesystem adapter (with hub insertion for >5 children), Visitor pattern for content decoration, YAML directory exporter, SQLite store for iterative authoring
YAML directory format: scenarios can be a directory with manifest + region files; scenariodir.Load() tries directory format first, falls back to single-file
Security hardening: path traversal protection in LoadDir, ID collision detection in TreeSource, atomic temp-dir writes in exporter, explicit flag value validation in CLI
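
For reviewers unfamiliar with the path traversal guard mentioned for LoadDir, a minimal sketch of one common approach follows. The function name resolveRegionPath and its signature are hypothetical and not taken from this PR. The check is purely lexical and does not resolve symlinks.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// resolveRegionPath (hypothetical helper) joins a manifest-supplied relative
// path onto the scenario directory and rejects results that land outside it.
func resolveRegionPath(scenarioDir, rel string) (string, error) {
	if filepath.IsAbs(rel) {
		return "", fmt.Errorf("region path %q must be relative", rel)
	}
	// Join cleans the result, collapsing any ".." segments.
	full := filepath.Join(scenarioDir, rel)
	back, err := filepath.Rel(scenarioDir, full)
	if err != nil {
		return "", err
	}
	// A relative path that starts with ".." points outside scenarioDir.
	if back == ".." || strings.HasPrefix(back, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("region path %q escapes scenario directory", rel)
	}
	return full, nil
}

func main() {
	_, err := resolveRegionPath("scenarios/demo", "../../etc/passwd")
	fmt.Println(err)
}
```

A loader that opens files through a guard like this still needs a separate policy for symlinks inside the scenario directory.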

Context

Hand-authoring a 200-room scenario in YAML produced 83 broken one-way connections (PR #45). Root cause: YAML embeds edges inside nodes, so each connection must be authored twice. Graph-first generation produces valid bidirectional connections by construction.
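
To illustrate "valid by construction": if the only way to add a connection is a call that writes both directions at once, a one-way link cannot be authored. The sketch below is an assumption about how such an API could look, not the actual internal/scengen code. The six direction names, the Graph layout, and Connect are all hypothetical.

```go
package main

import "fmt"

// Direction is one of six exits. The concrete names are an assumption;
// the PR only states that the compass has six directions.
type Direction string

const (
	North Direction = "north"
	South Direction = "south"
	East  Direction = "east"
	West  Direction = "west"
	Up    Direction = "up"
	Down  Direction = "down"
)

// opposite maps each direction to its reverse.
var opposite = map[Direction]Direction{
	North: South, South: North,
	East: West, West: East,
	Up: Down, Down: Up,
}

// Graph stores exits as room ID -> direction -> neighbor ID.
type Graph struct {
	exits map[string]map[Direction]string
}

func NewGraph() *Graph {
	return &Graph{exits: make(map[string]map[Direction]string)}
}

// Connect adds a -> b in direction d and b -> a in the opposite direction
// in one step, refusing to overwrite an exit that is already in use.
func (g *Graph) Connect(a, b string, d Direction) error {
	back, ok := opposite[d]
	if !ok {
		return fmt.Errorf("unknown direction %q", d)
	}
	if _, taken := g.exits[a][d]; taken {
		return fmt.Errorf("room %q already has an exit %s", a, d)
	}
	if _, taken := g.exits[b][back]; taken {
		return fmt.Errorf("room %q already has an exit %s", b, back)
	}
	g.set(a, d, b)
	g.set(b, back, a)
	return nil
}

func (g *Graph) set(from string, d Direction, to string) {
	if g.exits[from] == nil {
		g.exits[from] = make(map[Direction]string)
	}
	g.exits[from][d] = to
}

func main() {
	g := NewGraph()
	if err := g.Connect("entrance", "hall", North); err != nil {
		panic(err)
	}
	fmt.Println(g.exits["hall"][South]) // prints "entrance"
}
```

The YAML exporter can then emit each room's exits from this structure, so the duplicated per-node edges in the file are derived rather than hand-written.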

Test plan

47 new unit tests across 7 test files (go test -race ./internal/scengen/... ./internal/scenario/... ./internal/scenariodir/...)
Graph validation: bidirectionality, max degree 6, BFS connectivity, unreachable nodes
Tree adapter: simple/nested trees, file skipping, hub insertion for 8+ children, ID collision detection, empty/missing roots
YAML round-trip: Generate → WriteYAMLDir → LoadDir → scenario.Validate
SQLite round-trip: SaveGraph → LoadGraph with meta, nodes, edges, FK/CHECK constraints
Path traversal: region paths cannot escape scenario directory
End-to-end: crypt-admin generate --topology tree → crypt-admin validate
No SQLite in cryptd: go list -deps ./cmd/cryptd | grep sqlite returns empty
make check passes (vet + test + lint + markdownlint on changed files)
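
The graph-validation items in the test plan (bidirectionality, max degree 6, BFS connectivity) could be checked along the lines of the sketch below. The adjacency type and validate function are hypothetical simplifications that drop direction labels. The real validator in internal/scengen may be structured differently.

```go
package main

import "fmt"

// adjacency maps a room ID to its neighbor IDs (direction labels omitted).
type adjacency map[string][]string

const maxDegree = 6

func contains(list []string, want string) bool {
	for _, v := range list {
		if v == want {
			return true
		}
	}
	return false
}

// validate returns an error for a violated invariant: too many exits,
// a one-way edge, or a room that BFS from start cannot reach.
func validate(g adjacency, start string) error {
	for id, nbrs := range g {
		if len(nbrs) > maxDegree {
			return fmt.Errorf("room %q has %d exits, max is %d", id, len(nbrs), maxDegree)
		}
		for _, n := range nbrs {
			if !contains(g[n], id) {
				return fmt.Errorf("one-way edge %q -> %q", id, n)
			}
		}
	}
	// Breadth-first search from the start room.
	seen := map[string]bool{start: true}
	queue := []string{start}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, n := range g[cur] {
			if !seen[n] {
				seen[n] = true
				queue = append(queue, n)
			}
		}
	}
	for id := range g {
		if !seen[id] {
			return fmt.Errorf("room %q is unreachable from %q", id, start)
		}
	}
	return nil
}

func main() {
	g := adjacency{"a": {"b"}, "b": {"a"}, "c": {}}
	fmt.Println(validate(g, "a"))
}
```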

🤖 Generated with Claude Code

Note

High Risk
Large, cross-cutting change that adds a new authoring binary with SQLite dependency, introduces a new directory-based scenario format, and modifies core engine combat/leveling/item logic; regressions could affect gameplay balance, scenario loading, and daemon tool compatibility.

Overview
Introduces a new crypt-admin binary (generate, validate, export) for graph-first scenario authoring (DES-027), including optional SQLite-backed working copies and YAML directory export/validation.

Adds support for a directory-based scenario format (manifest + regions/*.yaml) and updates scenario resolution/validation flows (including deprecating cryptd validate and adding server-side default scenario handling).

Expands core gameplay mechanics for balance tuning (DES-028): armor damage reduction, consumable use_item support (engine + daemon tool + rules interpreter + loop events), and point-buy character creation with CON-based HP scaling, plus a new eval-balance monkey-test harness for running large parallel balance simulations.

Written by Cursor Bugbot for commit a0281b8. This will update automatically on new commits.

…ation (DES-027)

Implement graph-first scenario generation to replace error-prone manual YAML authoring. The new crypt-admin binary generates scenarios from topology sources (filesystem trees), assigns bidirectional compass directions via BFS, decorates rooms via the visitor pattern, and exports to a new YAML directory format.

New packages and files:
- internal/scengen/: Graph types, TopologySource interface, TreeSource adapter with hub insertion for >5 children, Visitor/DescriptionVisitor, YAML directory exporter, SQLite store for iterative authoring
- internal/scenario/loaddir.go: directory-format scenario loader with path traversal protection and cross-region duplicate room detection
- cmd/crypt-admin/: generate, validate, export subcommands

Modified:
- internal/scenariodir: directory format takes precedence over single-file
- cmd/cryptd: validate prints deprecation warning, rejects directories
- Makefile: build-admin target, build/clean updated for three binaries
- DESIGN.md: DES-027 ADR with SQLite justification
- CHANGELOG.md, README.md: document new binary and features

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

internal/scengen/tree.go

internal/scengen/store.go

Copilot

Pull request overview

Adds a new authoring workflow centered on graph-first scenario generation (DES-027), introducing a crypt-admin binary and the supporting internal/scengen package, plus a new directory-based scenario format loader/exporter.

Changes:

Introduces crypt-admin (generate, validate, export) and deprecates cryptd validate.
Adds internal/scengen graph model, topology sources (filesystem tree), visitors, YAML directory exporter, and SQLite store.
Adds directory-format scenario loading (scenario.LoadDir) and updates scenariodir.Load to prefer directory format over legacy single-file YAML.

Reviewed changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 14 comments.

File	Description
cmd/crypt-admin/main.go	New authoring CLI for generate/validate/export workflows.
cmd/cryptd/main.go	Deprecation warning + blocks validating directory-format scenarios.
internal/scengen/generator.go	Topology→graph pipeline and direction assignment logic.
internal/scengen/generator_test.go	Tests for generation, direction assignment, and tree source integration.
internal/scengen/graph.go	Core graph types, invariants, and validation.
internal/scengen/graph_test.go	Unit tests for graph behavior and validation.
internal/scengen/topology.go	TopologySource interface + raw node/edge types.
internal/scengen/tree.go	Filesystem tree topology source + hub insertion.
internal/scengen/tree_test.go	Tests for tree walking, hub insertion, and ID sanitization/collisions.
internal/scengen/visitor.go	Visitor interface + DescriptionVisitor for room content seeds.
internal/scengen/visitor_test.go	Tests for DescriptionVisitor behavior.
internal/scengen/export.go	YAML directory exporter (manifest + region files).
internal/scengen/export_test.go	Round-trip tests: Generate/Export → LoadDir → Validate.
internal/scengen/store.go	SQLite persistence for iterative authoring (crypt-admin only).
internal/scengen/store_test.go	Store round-trip + FK/CHECK constraint tests.
internal/scenario/loaddir.go	Directory-format scenario loader with traversal checks + duplicate room detection.
internal/scenario/loaddir_test.go	Tests for LoadDir (valid, traversal, duplicate room).
internal/scenariodir/scenariodir.go	Scenario ID resolution updated to prefer directory format.
internal/scenariodir/scenariodir_test.go	Tests for precedence, traversal rejection, directory loading.
go.mod	Adds modernc SQLite driver and related indirect deps.
go.sum	Dependency hash updates for new module additions.
Makefile	Builds third binary (`crypt-admin`) and cleans it.
README.md	Documents three binaries and crypt-admin usage.
DESIGN.md	Adds DES-027 ADR describing graph-first generation and formats.
CHANGELOG.md	Changelog entries for crypt-admin, directory format, and deprecation.
.beads/issues.jsonl	Updates roadmap/issues entries related to scenario tooling.

internal/scengen/generator.go

internal/scengen/graph.go

internal/scengen/tree.go

internal/scengen/tree_test.go

internal/scengen/export.go

internal/scenario/loaddir.go

internal/scenariodir/scenariodir.go

internal/scengen/visitor.go

…ame systems

Expands README from a brief overview to full project documentation:
- All three binaries with complete flag reference and usage examples
- Architecture section with dependency diagram and package map (18 packages)
- Both scenario formats (single-file and directory) with examples
- MCP tool surface (15 tools)
- Game systems table (movement, combat, inventory, spells, leveling, save/load)
- Build, test, and demo instructions with make targets
- Documentation map linking all design and architecture docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

internal/scengen/generator.go

…ce tuning

Add a parallel monkey-testing harness that runs N game sessions with weighted-random action selection and collects metrics for balance tuning. The MonkeyRenderer implements the Renderer interface and plugs directly into the game loop: no network, no SLM, runs 1000s of sessions/sec.

Action weights vary by game state: in combat (attack 70% when healthy, flee 40% when low HP), auto-equip weapons, prioritize item pickup.

Per-session metrics: moves, rooms visited, enemies killed, XP, level, damage dealt/taken, survival, flee attempts/successes, spells cast.

Aggregate report: survival rate, mean/median/p95 for all metrics, per-class breakdown when --class=all.

New files:
- internal/monkeytest/: metrics.go, monkey.go, runner.go + tests
- cmd/eval-balance/main.go: CLI with --scenario, --players, --max-moves, --class, --workers, --seed, --verbose flags
- Makefile: eval-balance and eval-balance-quick targets

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…lance

Engine changes:
- Armor defense: equipped armor's `defense` field reduces incoming damage (flat subtraction, floor at 1). New applyDefenses() consolidates defend stance + armor into one call in ProcessEnemyTurn.
- Consumable items: new UseItem() engine method consumes items with effect/power fields (e.g. heal potions with dice-based HP restore). New typed error NotConsumableError.
- ScenarioItem gains defense (int), power (dice string), effect (string).
- model.Item mirrors the new fields for runtime use.

Interpreter: "use", "drink", "eat" verbs parse to action type "use".
Narrator: templates for "used_item" and "not_consumable" events.
Game loop: dispatch for "use" action with error routing.

Scenario rebalance (unix-catacombs):
- OOM Killer: 25 HP → 15 HP, 1d8+2 → 1d6+1
- Segfault Daemon: 15 HP → 12 HP
- New enemy: Zombie Process (6 HP, 1d3) in /etc/shadow
- Weapon in starting room (short_sword 1d6)
- Alias Shield: now has defense: 2 (was cosmetic)
- 3 health potions placed along the path (2d6, 1d6+2, 3d6)

Monkey test metrics: LeveledUp, LevelsGained, PotionsUsed tracked.
Monkey strategy: auto-equips armor, uses potions at ≤50% HP.

Balance results (500 sessions, all classes):
- Before: 0% survival, mean XP 4.2, 0% level-up
- After: 65% survival, mean XP 27.7, 66% level-up

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
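
A rough sketch of the damage-reduction order this commit describes, combined with the ordering note from a later commit in this PR (defend halves first, armor subtracts second). The signature is hypothetical. Truncating integer division for the halving and applying the floor of 1 once at the end are assumptions, since the commit text does not spell them out.

```go
package main

import "fmt"

// applyDefenses (hypothetical signature) reduces incoming damage.
// Order: defend stance halves first, then armor subtracts a flat amount,
// then the result is floored at 1 so a hit always does some damage.
func applyDefenses(damage int, defending bool, armorDefense int) int {
	if defending {
		damage /= 2 // assumption: integer division, truncating
	}
	damage -= armorDefense
	if damage < 1 {
		damage = 1 // floor at 1
	}
	return damage
}

func main() {
	// Alias Shield has defense 2 in the rebalanced unix-catacombs scenario.
	fmt.Println(applyDefenses(8, false, 2)) // 8 - 2 = 6
	fmt.Println(applyDefenses(8, true, 2))  // 8/2 = 4, then 4 - 2 = 2
	fmt.Println(applyDefenses(1, false, 2)) // floored at 1
}
```

Halving before subtracting makes armor worth proportionally more while defending, which matches the stated intent that defend is already powerful on its own.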

internal/scengen/tree.go

internal/scengen/export.go

internal/monkeytest/monkey.go

HP per level is now base class amount + CON modifier ((CON-10)/2). Stats are applied before HP so the newly increased CON contributes immediately. Floor at 1 HP gain minimum.

Effect: fighters (CON growth) gain 9-10+ HP/level at higher levels. Mages (no CON growth, modifier 0) gain base 4 HP throughout. This differentiates fighter tankiness from caster fragility as levels scale.

New: docs/gameplay.tex, a 9-page LaTeX specification covering all game mechanics: attributes, classes, XP tables, combat (initiative, attack, defend, flee, armor reduction), spells, inventory, consumables, movement, commands, save/load, scenario format, balance tuning targets, and explicitly listed unimplemented mechanics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
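
The HP-per-level rule above can be worked through with a small sketch. The function names conModifier and hpGain are hypothetical. The formula (CON-10)/2, the floor of 1, and the mage base of 4 come from the commit message. Go's integer division truncates toward zero, which matters for odd scores below 10.

```go
package main

import "fmt"

// conModifier applies the (CON-10)/2 formula from the commit message.
// Go truncates toward zero, so CON 9 yields 0 and CON 8 yields -1.
func conModifier(con int) int {
	return (con - 10) / 2
}

// hpGain (hypothetical helper) is the per-level HP gain:
// class base plus CON modifier, never less than 1.
func hpGain(classBase, con int) int {
	gain := classBase + conModifier(con)
	if gain < 1 {
		gain = 1
	}
	return gain
}

func main() {
	fmt.Println(conModifier(14)) // 2
	fmt.Println(conModifier(9))  // 0, because -1/2 truncates to 0
	fmt.Println(hpGain(4, 10))   // 4: mage base with modifier 0
	fmt.Println(hpGain(1, 6))    // 1: 1 + (-2) is floored at 1
}
```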

internal/engine/leveling.go

Add NewCharacter() and ValidateStats() to centralize character creation. Players distribute 8 points across 6 stats (base 10 each, minimum 10). The old hardcoded STR:14 DEX:12 CON:12 is the default when no stats are provided.

- engine.NewCharacter(name, class, stats): validates class and stats, returns a ready-to-play Character. Nil stats uses defaults.
- engine.ValidateStats(stats): checks all >= 10, total points == 8.
- Daemon new_game RPC: accepts optional "stats" field in arguments.
- serve.go testing mode: uses NewCharacter with defaults.
- Monkey tester: class-optimal stat distributions (fighter: STR/CON 14, thief: DEX 14/STR 12/CON 12, mage: INT 14/WIS 12/DEX 12, etc.)

Balance impact: class survival spread narrowed from 15pp to 3pp (63-66%) because each class now starts with stats that match their role instead of uniform STR-heavy allocation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
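
The point-buy rule (every stat at least 10, exactly 8 bonus points spent) is simple enough to sketch. This is an illustration of the stated rule, not the real engine.ValidateStats. The Stats array type and the lowercase validateStats name are hypothetical, and an array is used so that no stat ordering has to be assumed.

```go
package main

import "fmt"

const (
	baseStat  = 10 // every stat starts here and may not go lower
	pointPool = 8  // bonus points that must be spent exactly
)

// Stats holds six ability scores. A plain array is a simplification;
// the real engine type is not shown in this PR page.
type Stats [6]int

// validateStats checks the two rules from the commit message:
// all stats >= 10 and total bonus points == 8.
func validateStats(s Stats) error {
	spent := 0
	for i, v := range s {
		if v < baseStat {
			return fmt.Errorf("stat %d is %d, below minimum %d", i, v, baseStat)
		}
		spent += v - baseStat
	}
	if spent != pointPool {
		return fmt.Errorf("spent %d points, want exactly %d", spent, pointPool)
	}
	return nil
}

func main() {
	def := Stats{14, 12, 12, 10, 10, 10} // the default 14/12/12 build: 4+2+2 = 8
	fmt.Println(validateStats(def))      // <nil>
	fmt.Println(validateStats(Stats{18, 10, 10, 10, 10, 9}))
}
```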

Update gameplay.tex to reflect commits 5-6:
- Point-buy stat allocation section: 8-point pool, base 10, validation rules, example builds table (tank/agile/scholarly/devout)
- Stat modifier formula and reference table (stat 8-20 → modifier -1 to +5)
- CON → HP per level documented with Go truncation note
- Attribute table updated: DEX and CON mechanical effects specified
- Starting state table: stats now "player-configured via point-buy"
- Unimplemented section: per-class defaults noted as guidance feature

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

internal/monkeytest/monkey.go

In interactive mode (cryptd serve -t without --script), a character creation prompt now asks for name, class, and stat allocation before the game begins. The player enters 6 numbers representing bonus points to add to each stat (e.g. "4 0 2 0 2 0" for STR 14, CON 12, DEX 12). Pressing Enter uses defaults (STR +4, DEX +2, CON +2). Scripted mode (--script) skips the prompt and uses flag defaults.

Updated gameplay.tex to document the interactive creation flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

internal/monkeytest/monkey.go

Interactive mode (cryptd serve -t without --script) now probes for ollama and uses SLM interpreter + narrator when available, falling back to rules+templates if no inference server is found. Previously, testing mode always used rules+templates regardless of ollama availability. Scripted mode (--script) retains deterministic rules+templates for reproducibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…tests

Engine coverage dropped from 90% to 82% after adding UseItem, applyDefenses, NewCharacter, and ValidateStats without tests.

New test files:
- use_item_test.go: heal potion, MaxHP cap, not-consumable, not-in-inventory
- character_test.go: NewCharacter defaults/custom/invalid, ValidateStats too many/few/below-minimum points, DefaultStats pool validation
- armor_test.go: armor reduces damage to floor 1, defend+armor stacks, no armor = full damage

Coverage: 82% → 92%

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

CHANGELOG: add entries for armor damage reduction, consumable items, CON modifier HP scaling, point-buy stat allocation, interactive character creation, SLM fix for -t mode, scenario rebalance, and eval-balance monkey test harness with level-up/potion tracking.

README: add cmd/eval-balance and internal/monkeytest to package map, expand game systems table (character creation, armor, consumables, CON scaling), add balance testing section with make targets, add docs/gameplay.pdf to documentation table, update DES reference to 028.

DESIGN.md: add DES-028 (Game Balance Mechanics) covering armor defense, consumables, CON modifier, point-buy stats. Documents alternatives rejected, balance results (0%→65% survival), and all affected files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ay.tex note

- Rename from "Dungeon" to "Crypt" throughout
- Component map: add cryptd/crypt/crypt-admin binaries, eval-balance, internal/monkeytest, internal/scengen packages
- Remove stale mcp-proxy entry (design-stage, never built)
- Add note directing readers to gameplay.tex for authoritative mechanics
- Version bump to v0.3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

internal/monkeytest/runner.go

Man pages (roff format) for all four commands:
- cryptd(1): server with modes, flags, character creation, MCP tools
- crypt(1): thin client with game commands table, play environments
- crypt-admin(1): scenario authoring with graph construction, examples
- eval-balance(1): monkey tester with strategy, metrics, balance targets

Makefile targets:
- make install: builds binaries, installs to PREFIX/bin and man pages to PREFIX/share/man/man1 (default PREFIX=/usr/local)
- make uninstall: removes installed binaries and man pages
- make man: view cryptd(1) locally without installing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add WithDefaultScenario() server option so cryptd serve --scenario <id> sets a default scenario for new_game calls that omit scenario_id. The client can still override by passing its own scenario_id.

New Makefile targets:
- make play: builds both binaries, starts cryptd serve -f with minimal scenario, connects crypt client, cleans up server on exit
- make play-unix: same flow with unix-catacombs scenario

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
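
The fallback behavior described here (client-supplied scenario_id wins, otherwise the server default applies) might be wired up as in the sketch below. Only the name WithDefaultScenario comes from the commit. The Server struct, the Option type, NewServer, and resolveScenario are hypothetical, and the functional-options shape is an assumption about how the option is implemented.

```go
package main

import (
	"errors"
	"fmt"
)

// Server is a stand-in for the daemon's server type.
type Server struct {
	defaultScenario string
}

// Option configures a Server at construction time.
type Option func(*Server)

// WithDefaultScenario sets the scenario used when new_game omits scenario_id.
func WithDefaultScenario(id string) Option {
	return func(s *Server) { s.defaultScenario = id }
}

func NewServer(opts ...Option) *Server {
	s := &Server{}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

// resolveScenario prefers the client's scenario_id and falls back to the
// server default; with neither present the request is rejected.
func (s *Server) resolveScenario(requested string) (string, error) {
	if requested != "" {
		return requested, nil
	}
	if s.defaultScenario != "" {
		return s.defaultScenario, nil
	}
	return "", errors.New("scenario_id is required: no default scenario configured")
}

func main() {
	srv := NewServer(WithDefaultScenario("unix-catacombs"))
	id, _ := srv.resolveScenario("")
	fmt.Println(id) // unix-catacombs
	id, _ = srv.resolveScenario("minimal")
	fmt.Println(id) // minimal
}
```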

- NewCharacter applies CON modifier to starting HP (base 20 + modifier). CON 12 (default) → HP 21. CON 14 (tank build) → HP 22.
- Hub insertion reserves one direction slot in the first batch for the hub link to overflow children, preventing degree > 6 on non-root nodes.
- Remove local max() function that shadowed Go 1.21+ builtin.
- requireFlagValue rejects flag-like values (--flag as value for --flag).
- Document applyDefenses order: defend halves first, armor subtracts second (intentional: defend is already powerful).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- UseItem rejects heal consumables with empty Power field instead of silently healing 0 and consuming the item.
- Wire use_item into daemon dispatcher so consumables work via JSON-RPC (was only reachable through the game loop, not the MCP tool surface).
- classStats returns default stats for unknown classes instead of nil.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Map NotConsumableError to CodeInvalidParams in engineError (was falling through to CodeInternalError, signaling server bug for client input error).
- Add dispatcher tests for use_item: heal potion success and not-consumable error (ToolResult isError=true with weapon type).
- Tighten hub insertion test bound from maxChildrenPerNode+1 to maxChildrenPerNode (first batch now reserves a slot).
- Add health_potion to minimal test scenario for use_item coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- LoadGraph validates the loaded graph before returning, catching corrupt/manually-edited SQLite data at load time instead of letting broken invariants propagate downstream.
- runExport self-validates the exported scenario via LoadDir (mirrors the safety net already in runGenerate).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- SaveGraph: skip "start" key in Meta loop to prevent PK violation when g.Meta contains a "start" entry (Bugbot).
- itemYAML: add Defense, Power, Effect fields so armor and consumable properties survive the generate→export→load round-trip (Bugbot).
- Monkey metrics: only count kills on combat-end when room is Cleared (victory), not on flee. Prevents inflated EnemiesKilled/DamageDealt that would mislead balance tuning (Bugbot).
- loadScenario: use scenariodir.Load for correct format precedence (directory first, then single-file). Was reversed (Bugbot).
- DescriptionVisitor comment: "6 children" → "maxChildrenPerNode" to match actual threshold (Copilot).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

internal/daemon/dispatcher.go

Copilot AI review requested due to automatic review settings March 15, 2026 21:25

Copilot started reviewing on behalf of jmf-pobox March 15, 2026 21:25

cursor bot reviewed Mar 15, 2026

internal/scengen/tree.go

internal/scengen/store.go

Copilot AI reviewed Mar 15, 2026

cursor bot reviewed Mar 15, 2026

internal/scengen/generator.go

jmf-pobox and others added 2 commits March 15, 2026 17:00