feat: extract deterministic operations into shell scripts (ADR-009 implementation) #8

@sgbett

Context

Issues #4/#5 and #6/#7 both attempted to fix reliability problems by correcting the instructions that the LLM interprets at runtime. This approach is fundamentally flawed: fixing text in a reference document that an LLM reads and interprets doesn't reliably fix behaviour. The LLM may improvise the commands, read stale copies, or deviate from the documented procedure — which is exactly what happened.

Evidence:

  • PR #5 (fix: correct copy path in setup-architect installation) fixed the copy path in installation-procedures.md. The installation subsequently failed in multiple sessions, only succeeding once when the right context happened to be present. The fix corrected the text but the LLM still improvised the actual command.
  • PR #7 (feat: Add date-based ADR numbering format) added date-based ADR numbering as a config option documented in create-adr/SKILL.md. In practice, the LLM ignored the config, used arbitrary formats, or applied the wrong format, because reading a config value and applying a format string is a deterministic operation being handled by a non-deterministic system.

The Real Problem

ADR-009 (Script-Based Deterministic Operations) already identified this class of problem and proposed shell scripts as the solution. However, it deferred implementation because:

  1. Necessity was scored 3/10 — but the candidate set was incomplete. It listed ADR numbering, member parsing, version validation, filename sanitisation, and file scanning — all low-stakes, cosmetic operations. It missed the installation procedure entirely — the highest-stakes deterministic operation in the framework (destructive, path-sensitive, multi-step with rm -rf).

  2. Complexity was scored 6/10 — partly due to "development overhead: writing and testing scripts takes longer than inline instructions." This assumed manual development effort. With LLM assistance, the infrastructure cost is negligible.

With the correct candidate set and realistic complexity assessment, the decision flips from "defer" to "implement now."

Trust erosion

Beyond the technical failure modes, there's a user trust dimension that the severity assessment missed. Incorrect ADR numbering is technically trivial to fix (rename the file), but the consequence is: "If this thing can't count right, what else can't it do?" People adopting an architecture framework are self-selecting for caring about correctness — they're the worst possible audience to show unreliable numbering to. Configuration options that are ignored are worse still: the user explicitly set a preference and the system disregarded it.

Proposed Solution

Split each skill into deterministic shell scripts (for file operations, numbering, config reading) and interpretive skill instructions (for project analysis, customisation, synthesis).

Scripts to create

| Script | Skill | What it replaces |
| --- | --- | --- |
| install-framework.sh | setup-architect | All file copy/mkdir/cleanup operations including the 5-safeguard .git removal |
| next-adr-number.sh | create-adr | ADR prefix generation from config (sequential with padding, date-based with format string) |
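
To illustrate the kind of guarded cleanup install-framework.sh might contain, here is a minimal sketch. This issue does not list the five safeguards, so the checks below are illustrative assumptions, and the function name `remove_clone_git` and its two arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: these checks are assumptions, not the
# framework's actual five safeguards.
set -euo pipefail

# Remove the .git directory of a temporary framework clone.
# $1 = clone directory, $2 = project root (its own .git must never be touched).
remove_clone_git() {
  local clone=${1:-} project_root=${2:-}

  # Refuse empty arguments so "rm -rf /.git"-style accidents cannot happen.
  if [[ -z $clone || -z $project_root ]]; then
    echo "usage: remove_clone_git <clone-dir> <project-root>" >&2
    return 2
  fi

  # Both directories must exist.
  if [[ ! -d $clone || ! -d $project_root ]]; then
    echo "refusing: clone or project root is not a directory" >&2
    return 1
  fi

  # Resolve symlinks and relative segments before comparing paths.
  local clone_abs root_abs
  clone_abs=$(CDPATH= cd -- "$clone" && pwd -P)
  root_abs=$(CDPATH= cd -- "$project_root" && pwd -P)

  # The clone must sit strictly inside the project root, never be the root itself.
  case $clone_abs in
    "$root_abs"/?*) ;;
    *)
      echo "refusing: $clone_abs is not inside $root_abs" >&2
      return 1
      ;;
  esac

  # Only delete a real directory named .git, never a symlink.
  local target=$clone_abs/.git
  if [[ -L $target || ! -d $target ]]; then
    echo "refusing: $target is not a real directory" >&2
    return 1
  fi

  rm -rf -- "$target"
}
```

Because every check runs in the script rather than in instructions the LLM interprets, a wrong path produces a refusal and a non-zero exit status instead of an improvised command.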

ADR numbering config (supersedes #6/#7)

The current numbering_format: date-based option tells the LLM what to do but not how. Replace with deterministic config:

numbering_format: sequential          # "sequential" or "date-based"
# sequential_format: "000"            # zero-padding width (default: 000 = 3 digits)
# date_format: "%Y%m%d"              # strftime format for date-based (default: %Y%m%d)

  • sequential_format specifies padding width by counting zeros: 000 = 3 digits, 0000 = 4 digits
  • date_format is a literal date format string: %Y%m%d → 20260210, %Y-%m-%d → 2026-02-10
  • Backward compatible: existing numbering_format values work with sensible defaults
  • Commented-out options are self-documenting
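
The config rules above could be implemented roughly as follows. This is a sketch under stated assumptions: the function names `config_value` and `next_adr_prefix` are hypothetical, the config is assumed to be flat `key: value` lines, and ADR files are assumed to start with an optional `ADR-` followed by the number.

```shell
#!/usr/bin/env bash
# Sketch only: function names, config layout, and file naming are assumptions.
set -euo pipefail

# Print the value of a top-level "key: value" line, or the default if absent.
# Commented-out lines are ignored; inline comments and double quotes are stripped.
config_value() {
  local file=$1 key=$2 default=$3 value
  value=$(sed -n -e "s/^${key}:[[:space:]]*//p" "$file" \
    | sed -n -e '1{s/[[:space:]]*#.*$//;s/^"\(.*\)"$/\1/;p;}')
  printf '%s\n' "${value:-$default}"
}

# Print the prefix for the next ADR given a config file and the ADR directory.
next_adr_prefix() {
  local config=$1 adr_dir=$2 mode
  mode=$(config_value "$config" numbering_format sequential)

  case $mode in
    sequential)
      local seq_fmt width max=0 n file base
      seq_fmt=$(config_value "$config" sequential_format 000)
      width=${#seq_fmt}            # padding width = number of characters in the format
      for file in "$adr_dir"/*; do
        base=${file##*/}
        if [[ $base =~ ^(ADR-)?([0-9]+) ]]; then
          n=$((10#${BASH_REMATCH[2]}))   # force base 10 so 009 is not read as octal
          if (( n > max )); then max=$n; fi
        fi
      done
      printf "%0${width}d\n" "$((max + 1))"
      ;;
    date-based)
      local date_fmt
      date_fmt=$(config_value "$config" date_format '%Y%m%d')
      date "+${date_fmt}"
      ;;
    *)
      echo "unknown numbering_format: $mode" >&2
      return 1
      ;;
  esac
}
```

With a directory containing `ADR-001-a.md` and `ADR-009-b.md` and the default config, `next_adr_prefix` prints `010`. An unrecognised `numbering_format` fails loudly rather than falling back to a guessed format.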

CI tests

Add a script-tests job to the existing claude-code-tests.yml workflow that tests the actual scripts (not simulations of what the LLM might do):

  • Installation against all project type fixtures
  • Sequential and date-based ADR numbering
  • Failure cases (missing clone, collisions, bad paths)
  • Config defaults and custom formats
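
A script-tests job needs very little machinery to cover the cases above. The following is a minimal plain-bash sketch: the helper names `assert_eq` and `assert_fails` are hypothetical, and `zero_pad` is only a stand-in for a real script under test.

```shell
#!/usr/bin/env bash
# Sketch of a plain-bash test helper; all names here are hypothetical.
set -uo pipefail

failures=0

# assert_eq <description> <expected> <actual>
assert_eq() {
  if [[ $2 == "$3" ]]; then
    echo "ok   - $1"
  else
    echo "FAIL - $1 (expected '$2', got '$3')"
    failures=$((failures + 1))
  fi
}

# assert_fails <description> <command...>: passes when the command exits non-zero.
assert_fails() {
  local desc=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "FAIL - $desc (command unexpectedly succeeded)"
    failures=$((failures + 1))
  else
    echo "ok   - $desc"
  fi
}

# Stand-in for a script under test: pad $2 to the width of the format in $1.
zero_pad() {
  [[ $2 =~ ^[0-9]+$ ]] || return 1
  printf "%0${#1}d\n" "$((10#$2))"
}

assert_eq "pads to the width of the format" "007" "$(zero_pad 000 7)"
assert_eq "honours a custom width" "0042" "$(zero_pad 0000 42)"
assert_fails "rejects non-numeric input" zero_pad 000 abc

# The final status is non-zero if any assertion failed, which fails the CI step.
[[ $failures -eq 0 ]]
```

In a real job, the stand-in would be replaced by calls to the actual scripts against the project type fixtures, so the tests exercise the scripts themselves.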

Supersedes

References


🤖 Generated with Claude Code
