feat(scengen): add crypt-admin binary with graph-first scenario generation (DES-027)#47
Merged
…ation (DES-027)

Implement graph-first scenario generation to replace error-prone manual YAML authoring. The new crypt-admin binary generates scenarios from topology sources (filesystem trees), assigns bidirectional compass directions via BFS, decorates rooms via the visitor pattern, and exports to a new YAML directory format.

New packages and files:
- internal/scengen/: Graph types, TopologySource interface, TreeSource adapter with hub insertion for >5 children, Visitor/DescriptionVisitor, YAML directory exporter, SQLite store for iterative authoring
- internal/scenario/loaddir.go: directory-format scenario loader with path traversal protection and cross-region duplicate room detection
- cmd/crypt-admin/: generate, validate, export subcommands

Modified:
- internal/scenariodir: directory format takes precedence over single-file
- cmd/cryptd: validate prints deprecation warning, rejects directories
- Makefile: build-admin target, build/clean updated for three binaries
- DESIGN.md: DES-027 ADR with SQLite justification
- CHANGELOG.md, README.md: document new binary and features

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
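The BFS direction assignment mentioned in this commit could look roughly like the sketch below. It is not the code from `internal/scengen`: the function name `AssignDirections`, the six direction names, and the restriction to tree-shaped input are all assumptions made for illustration.

```go
package main

import "fmt"

// opposite pairs each direction with its reverse. The PR mentions a
// 6-direction compass; these particular names are an assumption.
var opposite = map[string]string{
	"north": "south", "south": "north",
	"east": "west", "west": "east",
	"up": "down", "down": "up",
}

// order fixes the sequence in which free slots are tried, so output is stable.
var order = []string{"north", "east", "south", "west", "up", "down"}

// AssignDirections walks a tree-shaped adjacency list breadth-first from start.
// Each newly discovered room gets a direction from its parent and the opposite
// direction back, so every connection is bidirectional by construction.
// The result is exits[room][direction] = neighbour.
func AssignDirections(adj map[string][]string, start string) (map[string]map[string]string, error) {
	exits := map[string]map[string]string{start: {}}
	queue := []string{start}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, next := range adj[cur] {
			if _, seen := exits[next]; seen {
				continue // already linked (for a tree, this is the parent)
			}
			dir := ""
			for _, d := range order {
				if _, used := exits[cur][d]; !used {
					dir = d
					break
				}
			}
			if dir == "" {
				return nil, fmt.Errorf("room %q has no free direction for %q", cur, next)
			}
			exits[cur][dir] = next
			exits[next] = map[string]string{opposite[dir]: cur}
			queue = append(queue, next)
		}
	}
	return exits, nil
}

func main() {
	adj := map[string][]string{
		"root": {"etc", "usr"},
		"etc":  {"root"},
		"usr":  {"root"},
	}
	exits, err := AssignDirections(adj, "root")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(exits["root"], exits["etc"], exits["usr"])
}
```

A room with more neighbours than free slots yields an error here, which is the situation the hub insertion for >5 children is meant to avoid.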
Pull request overview
Adds a new authoring workflow centered on graph-first scenario generation (DES-027), introducing a crypt-admin binary and the supporting internal/scengen package, plus a new directory-based scenario format loader/exporter.
Changes:
- Introduces `crypt-admin` (`generate`, `validate`, `export`) and deprecates `cryptd validate`.
- Adds `internal/scengen` graph model, topology sources (filesystem tree), visitors, YAML directory exporter, and SQLite store.
- Adds directory-format scenario loading (`scenario.LoadDir`) and updates `scenariodir.Load` to prefer directory format over legacy single-file YAML.
Reviewed changes
Copilot reviewed 25 out of 26 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| cmd/crypt-admin/main.go | New authoring CLI for generate/validate/export workflows. |
| cmd/cryptd/main.go | Deprecation warning + blocks validating directory-format scenarios. |
| internal/scengen/generator.go | Topology→graph pipeline and direction assignment logic. |
| internal/scengen/generator_test.go | Tests for generation, direction assignment, and tree source integration. |
| internal/scengen/graph.go | Core graph types, invariants, and validation. |
| internal/scengen/graph_test.go | Unit tests for graph behavior and validation. |
| internal/scengen/topology.go | TopologySource interface + raw node/edge types. |
| internal/scengen/tree.go | Filesystem tree topology source + hub insertion. |
| internal/scengen/tree_test.go | Tests for tree walking, hub insertion, and ID sanitization/collisions. |
| internal/scengen/visitor.go | Visitor interface + DescriptionVisitor for room content seeds. |
| internal/scengen/visitor_test.go | Tests for DescriptionVisitor behavior. |
| internal/scengen/export.go | YAML directory exporter (manifest + region files). |
| internal/scengen/export_test.go | Round-trip tests: Generate/Export → LoadDir → Validate. |
| internal/scengen/store.go | SQLite persistence for iterative authoring (crypt-admin only). |
| internal/scengen/store_test.go | Store round-trip + FK/CHECK constraint tests. |
| internal/scenario/loaddir.go | Directory-format scenario loader with traversal checks + duplicate room detection. |
| internal/scenario/loaddir_test.go | Tests for LoadDir (valid, traversal, duplicate room). |
| internal/scenariodir/scenariodir.go | Scenario ID resolution updated to prefer directory format. |
| internal/scenariodir/scenariodir_test.go | Tests for precedence, traversal rejection, directory loading. |
| go.mod | Adds modernc SQLite driver and related indirect deps. |
| go.sum | Dependency hash updates for new module additions. |
| Makefile | Builds third binary (crypt-admin) and cleans it. |
| README.md | Documents three binaries and crypt-admin usage. |
| DESIGN.md | Adds DES-027 ADR describing graph-first generation and formats. |
| CHANGELOG.md | Changelog entries for crypt-admin, directory format, and deprecation. |
| .beads/issues.jsonl | Updates roadmap/issues entries related to scenario tooling. |
…ame systems

Expands README from a brief overview to full project documentation:
- All three binaries with complete flag reference and usage examples
- Architecture section with dependency diagram and package map (18 packages)
- Both scenario formats (single-file and directory) with examples
- MCP tool surface (15 tools)
- Game systems table (movement, combat, inventory, spells, leveling, save/load)
- Build, test, and demo instructions with make targets
- Documentation map linking all design and architecture docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ce tuning

Add a parallel monkey-testing harness that runs N game sessions with weighted-random action selection and collects metrics for balance tuning. The MonkeyRenderer implements the Renderer interface and plugs directly into the game loop: no network, no SLM, runs 1000s of sessions/sec.

Action weights vary by game state: in combat (attack 70% when healthy, flee 40% when low HP), auto-equip weapons, prioritize item pickup.

Per-session metrics: moves, rooms visited, enemies killed, XP, level, damage dealt/taken, survival, flee attempts/successes, spells cast.

Aggregate report: survival rate, mean/median/p95 for all metrics, per-class breakdown when --class=all.

New files:
- internal/monkeytest/: metrics.go, monkey.go, runner.go + tests
- cmd/eval-balance/main.go: CLI with --scenario, --players, --max-moves, --class, --workers, --seed, --verbose flags
- Makefile: eval-balance and eval-balance-quick targets

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lance

Engine changes:
- Armor defense: equipped armor's `defense` field reduces incoming damage (flat subtraction, floor at 1). New applyDefenses() consolidates defend stance + armor into one call in ProcessEnemyTurn.
- Consumable items: new UseItem() engine method consumes items with effect/power fields (e.g. heal potions with dice-based HP restore). New typed error NotConsumableError.
- ScenarioItem gains defense (int), power (dice string), effect (string).
- model.Item mirrors the new fields for runtime use.

Interpreter: "use", "drink", "eat" verbs parse to action type "use".
Narrator: templates for "used_item" and "not_consumable" events.
Game loop: dispatch for "use" action with error routing.

Scenario rebalance (unix-catacombs):
- OOM Killer: 25 HP → 15 HP, 1d8+2 → 1d6+1
- Segfault Daemon: 15 HP → 12 HP
- New enemy: Zombie Process (6 HP, 1d3) in /etc/shadow
- Weapon in starting room (short_sword 1d6)
- Alias Shield: now has defense: 2 (was cosmetic)
- 3 health potions placed along the path (2d6, 1d6+2, 3d6)

Monkey test metrics: LeveledUp, LevelsGained, PotionsUsed tracked.
Monkey strategy: auto-equips armor, uses potions at ≤50% HP.

Balance results (500 sessions, all classes):
- Before: 0% survival, mean XP 4.2, 0% level-up
- After: 65% survival, mean XP 27.7, 66% level-up

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
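The armor rule (flat subtraction, floor at 1) combined with the defend stance can be illustrated with a small function. This is a sketch, not the engine's `applyDefenses`: the signature is hypothetical, and the halve-then-subtract order follows a later commit in this PR, with integer truncation on the halving assumed.

```go
package main

import "fmt"

// applyDefenses reduces incoming damage. Defend halves the damage first
// (integer division, an assumption), armor then subtracts a flat amount,
// and the result never drops below 1.
func applyDefenses(damage int, defending bool, armorDefense int) int {
	if defending {
		damage /= 2
	}
	damage -= armorDefense
	if damage < 1 {
		damage = 1
	}
	return damage
}

func main() {
	fmt.Println(applyDefenses(8, false, 0)) // no defenses
	fmt.Println(applyDefenses(8, false, 2)) // armor only
	fmt.Println(applyDefenses(8, true, 2))  // defend, then armor
	fmt.Println(applyDefenses(1, true, 2))  // floored at 1
}
```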
HP per level is now base class amount + CON modifier ((CON-10)/2). Stats are applied before HP so the newly increased CON contributes immediately. Floor at 1 HP gain minimum.

Effect: fighters (CON growth) gain 9-10+ HP/level at higher levels. Mages (no CON growth, modifier 0) gain base 4 HP throughout. This differentiates fighter tankiness from caster fragility as levels scale.

New: docs/gameplay.tex, a 9-page LaTeX specification covering all game mechanics: attributes, classes, XP tables, combat (initiative, attack, defend, flee, armor reduction), spells, inventory, consumables, movement, commands, save/load, scenario format, balance tuning targets, and explicitly listed unimplemented mechanics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
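The HP formula in this commit is small enough to show as a worked sketch. The function names `conModifier` and `hpGain` are hypothetical; the formula, the floor of 1, and the mage base of 4 come from the commit message.

```go
package main

import "fmt"

// conModifier implements (CON-10)/2 with Go integer division, which
// truncates toward zero: CON 9 gives 0, CON 8 gives -1.
func conModifier(con int) int {
	return (con - 10) / 2
}

// hpGain is the HP added on level-up: class base plus CON modifier,
// never less than 1.
func hpGain(classBase, con int) int {
	gain := classBase + conModifier(con)
	if gain < 1 {
		gain = 1
	}
	return gain
}

func main() {
	fmt.Println(hpGain(4, 10)) // mage with CON 10: base only
	fmt.Println(hpGain(4, 14)) // same base with CON 14: +2
	fmt.Println(hpGain(1, 6))  // negative modifier hits the floor
}
```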
Add NewCharacter() and ValidateStats() to centralize character creation. Players distribute 8 points across 6 stats (base 10 each, minimum 10). The old hardcoded STR:14 DEX:12 CON:12 is the default when no stats are provided.

- engine.NewCharacter(name, class, stats): validates class and stats, returns a ready-to-play Character. Nil stats uses defaults.
- engine.ValidateStats(stats): checks all >= 10, total points == 8.
- Daemon new_game RPC: accepts optional "stats" field in arguments.
- serve.go testing mode: uses NewCharacter with defaults.
- Monkey tester: class-optimal stat distributions (fighter: STR/CON 14, thief: DEX 14/STR 12/CON 12, mage: INT 14/WIS 12/DEX 12, etc.)

Balance impact: class survival spread narrowed from 15pp to 3pp (63-66%) because each class now starts with stats that match their role instead of uniform STR-heavy allocation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
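The two point-buy rules (every stat at least 10, exactly 8 points spent) can be sketched as a validator. The `Stats` struct layout is an assumption: the PR names STR, DEX, CON, INT, and WIS, and the sixth field `CHA` is a guess. The real `engine.ValidateStats` signature may differ.

```go
package main

import "fmt"

// Stats holds the six attributes. CHA as the sixth stat is an assumption.
type Stats struct {
	STR, DEX, CON, INT, WIS, CHA int
}

const (
	baseStat  = 10 // every stat starts here and may not go lower
	pointPool = 8  // points that must be distributed in total
)

// ValidateStats checks the point-buy rules from the commit message.
func ValidateStats(s Stats) error {
	spent := 0
	for _, v := range []int{s.STR, s.DEX, s.CON, s.INT, s.WIS, s.CHA} {
		if v < baseStat {
			return fmt.Errorf("stat value %d is below the minimum of %d", v, baseStat)
		}
		spent += v - baseStat
	}
	if spent != pointPool {
		return fmt.Errorf("spent %d points, want exactly %d", spent, pointPool)
	}
	return nil
}

func main() {
	defaults := Stats{STR: 14, DEX: 12, CON: 12, INT: 10, WIS: 10, CHA: 10}
	fmt.Println(ValidateStats(defaults))
	fmt.Println(ValidateStats(Stats{STR: 14, DEX: 12, CON: 11, INT: 10, WIS: 10, CHA: 10}))
}
```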
Update gameplay.tex to reflect commits 5-6:
- Point-buy stat allocation section: 8-point pool, base 10, validation rules, example builds table (tank/agile/scholarly/devout)
- Stat modifier formula and reference table (stat 8-20 → modifier -1 to +5)
- CON → HP per level documented with Go truncation note
- Attribute table updated: DEX and CON mechanical effects specified
- Starting state table: stats now "player-configured via point-buy"
- Unimplemented section: per-class defaults noted as guidance feature

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
In interactive mode (cryptd serve -t without --script), a character creation prompt now asks for name, class, and stat allocation before the game begins. The player enters 6 numbers representing bonus points to add to each stat (e.g. "4 0 2 0 2 0" for STR 14, CON 12, DEX 12). Pressing Enter uses defaults (STR +4, DEX +2, CON +2).

Scripted mode (--script) skips the prompt and uses flag defaults.

Updated gameplay.tex to document the interactive creation flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
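Parsing the six-number allocation line could be handled along these lines. The function `parseBonuses` is hypothetical, and the sketch deliberately returns the bonuses by position only, since the commit message does not spell out which position maps to which stat.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const pointPool = 8 // total bonus points, per the point-buy rules

// parseBonuses reads six non-negative integers that sum to pointPool.
// An empty line means "use the defaults", reported via useDefaults.
func parseBonuses(line string) (bonuses [6]int, useDefaults bool, err error) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return [6]int{}, true, nil
	}
	if len(fields) != len(bonuses) {
		return [6]int{}, false, fmt.Errorf("want %d numbers, got %d", len(bonuses), len(fields))
	}
	total := 0
	for i, f := range fields {
		n, convErr := strconv.Atoi(f)
		if convErr != nil || n < 0 {
			return [6]int{}, false, fmt.Errorf("invalid bonus %q", f)
		}
		bonuses[i] = n
		total += n
	}
	if total != pointPool {
		return [6]int{}, false, fmt.Errorf("bonuses sum to %d, want %d", total, pointPool)
	}
	return bonuses, false, nil
}

func main() {
	fmt.Println(parseBonuses("4 0 2 0 2 0"))
	fmt.Println(parseBonuses(""))
	fmt.Println(parseBonuses("8 0 0 0 0 1"))
}
```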
Interactive mode (cryptd serve -t without --script) now probes for ollama and uses SLM interpreter + narrator when available, falling back to rules+templates if no inference server is found. Previously, testing mode always used rules+templates regardless of ollama availability.

Scripted mode (--script) retains deterministic rules+templates for reproducibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tests

Engine coverage dropped from 90% to 82% after adding UseItem, applyDefenses, NewCharacter, and ValidateStats without tests.

New test files:
- use_item_test.go: heal potion, MaxHP cap, not-consumable, not-in-inventory
- character_test.go: NewCharacter defaults/custom/invalid, ValidateStats too many/few/below-minimum points, DefaultStats pool validation
- armor_test.go: armor reduces damage to floor 1, defend+armor stacks, no armor = full damage

Coverage: 82% → 92%

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CHANGELOG: add entries for armor damage reduction, consumable items, CON modifier HP scaling, point-buy stat allocation, interactive character creation, SLM fix for -t mode, scenario rebalance, and eval-balance monkey test harness with level-up/potion tracking.

README: add cmd/eval-balance and internal/monkeytest to package map, expand game systems table (character creation, armor, consumables, CON scaling), add balance testing section with make targets, add docs/gameplay.pdf to documentation table, update DES reference to 028.

DESIGN.md: add DES-028 (Game Balance Mechanics) covering armor defense, consumables, CON modifier, point-buy stats. Documents alternatives rejected, balance results (0%→65% survival), and all affected files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ay.tex note

- Rename from "Dungeon" to "Crypt" throughout
- Component map: add cryptd/crypt/crypt-admin binaries, eval-balance, internal/monkeytest, internal/scengen packages
- Remove stale mcp-proxy entry (design-stage, never built)
- Add note directing readers to gameplay.tex for authoritative mechanics
- Version bump to v0.3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Man pages (roff format) for all four commands:
- cryptd(1): server with modes, flags, character creation, MCP tools
- crypt(1): thin client with game commands table, play environments
- crypt-admin(1): scenario authoring with graph construction, examples
- eval-balance(1): monkey tester with strategy, metrics, balance targets

Makefile targets:
- make install: builds binaries, installs to PREFIX/bin and man pages to PREFIX/share/man/man1 (default PREFIX=/usr/local)
- make uninstall: removes installed binaries and man pages
- make man: view cryptd(1) locally without installing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add WithDefaultScenario() server option so cryptd serve --scenario <id> sets a default scenario for new_game calls that omit scenario_id. The client can still override by passing its own scenario_id.

New Makefile targets:
- make play: builds both binaries, starts cryptd serve -f with minimal scenario, connects crypt client, cleans up server on exit
- make play-unix: same flow with unix-catacombs scenario

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
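A server option like `WithDefaultScenario()` is typically built with Go's functional-options pattern. The sketch below shows that pattern under assumed names: the `Server` struct, `NewServer`, and `resolveScenario` are hypothetical stand-ins, and only the option name and the override behaviour come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// Server is a stand-in for the daemon; only the field relevant here is shown.
type Server struct {
	defaultScenario string
}

// Option mutates a Server during construction.
type Option func(*Server)

// WithDefaultScenario sets the scenario used when new_game omits scenario_id.
func WithDefaultScenario(id string) Option {
	return func(s *Server) { s.defaultScenario = id }
}

// NewServer applies the options in order.
func NewServer(opts ...Option) *Server {
	s := &Server{}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

// resolveScenario prefers the client's scenario_id and falls back to the
// server default; with neither set, the request is rejected.
func (s *Server) resolveScenario(requested string) (string, error) {
	if requested != "" {
		return requested, nil
	}
	if s.defaultScenario != "" {
		return s.defaultScenario, nil
	}
	return "", errors.New("scenario_id is required")
}

func main() {
	srv := NewServer(WithDefaultScenario("minimal"))
	fmt.Println(srv.resolveScenario(""))
	fmt.Println(srv.resolveScenario("unix-catacombs"))
}
```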
- NewCharacter applies CON modifier to starting HP (base 20 + modifier). CON 12 (default) → HP 21. CON 14 (tank build) → HP 22.
- Hub insertion reserves one direction slot in the first batch for the hub link to overflow children, preventing degree > 6 on non-root nodes.
- Remove local max() function that shadowed Go 1.21+ builtin.
- requireFlagValue rejects flag-like values (--flag as value for --flag).
- Document applyDefenses order: defend halves first, armor subtracts second (intentional: defend is already powerful).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- UseItem rejects heal consumables with empty Power field instead of silently healing 0 and consuming the item.
- Wire use_item into daemon dispatcher so consumables work via JSON-RPC (was only reachable through the game loop, not the MCP tool surface).
- classStats returns default stats for unknown classes instead of nil.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Map NotConsumableError to CodeInvalidParams in engineError (was falling through to CodeInternalError, signaling server bug for client input error).
- Add dispatcher tests for use_item: heal potion success and not-consumable error (ToolResult isError=true with weapon type).
- Tighten hub insertion test bound from maxChildrenPerNode+1 to maxChildrenPerNode (first batch now reserves a slot).
- Add health_potion to minimal test scenario for use_item coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- LoadGraph validates the loaded graph before returning, catching corrupt/manually-edited SQLite data at load time instead of letting broken invariants propagate downstream.
- runExport self-validates the exported scenario via LoadDir (mirrors the safety net already in runGenerate).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- SaveGraph: skip "start" key in Meta loop to prevent PK violation when g.Meta contains a "start" entry (Bugbot).
- itemYAML: add Defense, Power, Effect fields so armor and consumable properties survive the generate→export→load round-trip (Bugbot).
- Monkey metrics: only count kills on combat-end when room is Cleared (victory), not on flee. Prevents inflated EnemiesKilled/DamageDealt that would mislead balance tuning (Bugbot).
- loadScenario: use scenariodir.Load for correct format precedence (directory first, then single-file). Was reversed (Bugbot).
- DescriptionVisitor comment: "6 children" → "maxChildrenPerNode" to match actual threshold (Copilot).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Summary
- `crypt-admin` binary with `generate`, `validate`, and `export` subcommands for graph-first scenario authoring (DES-027)
- `internal/scengen/` package: Graph types with 6-direction compass, TopologySource interface, TreeSource filesystem adapter (with hub insertion for >5 children), Visitor pattern for content decoration, YAML directory exporter, SQLite store for iterative authoring
- `scenariodir.Load()` tries directory format first, falls back to single-file
- `LoadDir`, ID collision detection in TreeSource, atomic temp-dir writes in exporter, explicit flag value validation in CLI

Context
Hand-authoring a 200-room scenario in YAML produced 83 broken one-way connections (PR #45). Root cause: YAML embeds edges inside nodes, so each connection must be authored twice. Graph-first generation produces valid bidirectional connections by construction.
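The failure mode described here, an exit authored on one room with no matching return exit on the other, is easy to detect mechanically. The sketch below is an illustration only: the `rooms` map shape, the direction names, and the function `OneWayExits` are assumptions, not the project's validator.

```go
package main

import (
	"fmt"
	"sort"
)

// opposite pairs each direction with its reverse (names are assumed).
var opposite = map[string]string{
	"north": "south", "south": "north",
	"east": "west", "west": "east",
	"up": "down", "down": "up",
}

// OneWayExits scans node-embedded exits (room -> direction -> target) and
// reports every exit whose target lacks the opposite exit back to the source.
func OneWayExits(rooms map[string]map[string]string) []string {
	var broken []string
	for id, exits := range rooms {
		for dir, target := range exits {
			if rooms[target][opposite[dir]] != id {
				broken = append(broken, fmt.Sprintf("%s -%s-> %s", id, dir, target))
			}
		}
	}
	sort.Strings(broken) // map iteration is unordered; sort for stable output
	return broken
}

func main() {
	rooms := map[string]map[string]string{
		"entry": {"north": "hall"},
		"hall":  {"south": "entry", "east": "vault"},
		"vault": {}, // the author forgot the "west" exit back to hall
	}
	fmt.Println(OneWayExits(rooms))
}
```

When edges are stored once in a graph and both exits are derived from each edge, this check has nothing to find, which is the point of the graph-first approach.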
Test plan
- `go test -race ./internal/scengen/... ./internal/scenario/... ./internal/scenariodir/...`
- `crypt-admin generate --topology tree` → `crypt-admin validate`
- `go list -deps ./cmd/cryptd | grep sqlite` returns empty
- `make check` passes (vet + test + lint + markdownlint on changed files)

🤖 Generated with Claude Code
Note
High Risk
Large, cross-cutting change that adds a new authoring binary with SQLite dependency, introduces a new directory-based scenario format, and modifies core engine combat/leveling/item logic; regressions could affect gameplay balance, scenario loading, and daemon tool compatibility.
Overview
Introduces a new `crypt-admin` binary (`generate`, `validate`, `export`) for graph-first scenario authoring (DES-027), including optional SQLite-backed working copies and YAML directory export/validation.

Adds support for a directory-based scenario format (manifest + `regions/*.yaml`) and updates scenario resolution/validation flows (including deprecating `cryptd validate` and adding server-side default scenario handling).

Expands core gameplay mechanics for balance tuning (DES-028): armor damage reduction, consumable `use_item` support (engine + daemon tool + rules interpreter + loop events), and point-buy character creation with CON-based HP scaling, plus a new `eval-balance` monkey-test harness for running large parallel balance simulations.

Written by Cursor Bugbot for commit a0281b8.