feat: prompt cache optimization — structure context for cache hits

## Context

Claude Code splits system prompts into static (globally cacheable) and dynamic (session-scoped) content using a boundary marker. Static content hits prompt cache = ~90% token savings on the system prompt portion.

**Our agents rebuild the entire prompt from scratch each execution, mixing static and dynamic content.** This means zero prompt cache benefit across agents or across turns within a conversation.

## From Claude Code Source

```typescript
// prompts.ts
const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = '__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__'

// Static sections (before boundary) — cached globally:
// - Identity, rules, tool usage guidelines, tone, output style
// Dynamic sections (after boundary) — per-session:
// - Git status, CLAUDE.md, memory, environment info

// utils/api.ts — splitSysPromptPrefix()
// Assigns cache_control: { type: 'ephemeral', scope: 'global' } to static blocks
```

## How Our Context Flows Today

```
agent-runner.ts builds prompt:
  "You are {agent} from squad {squad}."
  + taskDirective (dynamic)
  + SYSTEM.md (static — same for all agents)
  + squadContext (mixed — company.md is static, state.md is dynamic)
  + cognitionContext (dynamic)
```

All of this goes into the `--` prompt argument to Claude, which becomes the first user message. Claude Code's own system prompt (from CLAUDE.md files) IS cached, but our agent context is NOT.

## Proposed Architecture

Split our context into two injection points:

### 1. Static context → Project CLAUDE.md (cached by Claude Code)

Move SYSTEM.md and company.md into a project-level `.claude/CLAUDE.md` that Claude Code loads and caches automatically:

```markdown
# .claude/CLAUDE.md (in agents repo)


## System Protocol
{contents of SYSTEM.md}

## Company
{contents of company.md}
```

### 2. Dynamic context → `--` prompt argument (per-execution)

Keep per-agent, per-run context in the prompt:

```
"You are {agent} from squad {squad}."
+ goals.md, state.md, feedback.md, priorities.md
+ taskDirective
```

### Expected savings

SYSTEM.md (~2K tokens) + company.md (~500 tokens) = ~2.5K tokens cached across ALL agent executions. With prompt cache pricing at 90% discount, that's significant at scale.

### Implementation

1. Generate `.claude/CLAUDE.md` from SYSTEM.md + company.md during `squads init` or as a build step
2. Or: restructure so SYSTEM.md IS the project CLAUDE.md (simplest)
3. Strip static content from `gatherSquadContext()` so it's not double-injected
4. Verify Claude Code is loading the project CLAUDE.md (check `claudemd.ts` discovery paths)

### Files to change

- `src/lib/run-context.ts` — remove SYSTEM.md and company.md from prompt injection
- `.agents/SYSTEM.md` → `.claude/CLAUDE.md` migration (or symlink)
- `src/lib/agent-runner.ts` — stop passing systemContext separately

## Risks

- Claude Code's CLAUDE.md loading has specific rules (frontmatter, @include, etc.). Need to validate compatibility.
- Project CLAUDE.md is shared across all agents — can't include agent-specific instructions there. This is the right constraint (static = shared).
- Existing users with custom `.claude/CLAUDE.md` would get our content injected. Need to merge, not overwrite.

## Open questions

- Does Claude Code's `--print` mode respect project CLAUDE.md? Need to verify.
- Can we use `@include` directives to keep SYSTEM.md and company.md as separate files but loaded via CLAUDE.md?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: prompt cache optimization — structure context for cache hits #703

Context

From Claude Code Source

How Our Context Flows Today

Proposed Architecture

1. Static context → Project CLAUDE.md (cached by Claude Code)

2. Dynamic context → `--` prompt argument (per-execution)

Expected savings

Implementation

Files to change

Risks

Open questions

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

feat: prompt cache optimization — structure context for cache hits #703

Description

Context

From Claude Code Source

How Our Context Flows Today

Proposed Architecture

1. Static context → Project CLAUDE.md (cached by Claude Code)

2. Dynamic context → -- prompt argument (per-execution)

Expected savings

Implementation

Files to change

Risks

Open questions

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

2. Dynamic context → `--` prompt argument (per-execution)