Model routing - map effort levels to different models #24
Description
Use Case
Phantom uses a single global model field in phantom.yaml. Every session uses the same model regardless of task complexity -- a quick status question costs the same as a multi-file refactor.
This is especially noticeable because Phantom's Claude Code SDK preset adds roughly 350k tokens of baseline per session. Most of that cost is justified for complex coding or reasoning tasks. For short conversational exchanges, it is not.
The effort field already exists as the agent's signal for how much work a task requires (low / medium / high / max). Model selection is the natural next step: let the effort level also determine which model is used.
Proposed Config
New optional models block in phantom.yaml:
# phantom.yaml
# Existing field -- still used as fallback when models: is absent.
# Backward compatible: if you don't add models:, nothing changes.
model: claude-sonnet-4-6
# Optional: map each effort level to a specific model.
models:
  low: claude-haiku-4-5
  medium: claude-haiku-4-5
  high: claude-sonnet-4-6
  max: claude-opus-4-6

Behavior:
- If models is present: the model is selected per-session by looking up the current effort value in the map.
- If models is absent: existing single-model behavior is unchanged.
- Partial maps are valid: unspecified effort levels fall back to the global model field.
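The three rules above can be sketched as a single lookup with a fallback. This is a minimal illustration, not code from the Phantom repository: the names EffortLevel, ModelRoutingConfig, and resolveModel are hypothetical, and the config shape is assumed from the proposed YAML.

```typescript
// Hypothetical names: EffortLevel, ModelRoutingConfig, and resolveModel are
// illustrative and are not taken from the Phantom codebase.
type EffortLevel = "low" | "medium" | "high" | "max";

interface ModelRoutingConfig {
  model: string; // existing global field, used as the fallback
  models?: Partial<Record<EffortLevel, string>>; // proposed optional map
}

function resolveModel(config: ModelRoutingConfig, effort: EffortLevel): string {
  // If the map is absent, or has no entry for this effort level,
  // fall back to the global model.
  return config.models?.[effort] ?? config.model;
}

// A partial map: only low and max are routed explicitly.
const partial: ModelRoutingConfig = {
  model: "claude-sonnet-4-6",
  models: { low: "claude-haiku-4-5", max: "claude-opus-4-6" },
};

console.log(resolveModel(partial, "low")); // claude-haiku-4-5
console.log(resolveModel(partial, "medium")); // claude-sonnet-4-6 (fallback)
console.log(resolveModel({ model: "claude-sonnet-4-6" }, "max")); // claude-sonnet-4-6
```

Because the fallback is the existing model field, a config without a models block resolves exactly as it does today.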
Implementation Notes
The change is small and confined to two files:
- src/config/schemas.ts -- extend PhantomConfigSchema with an optional models record field
- src/agent/runtime.ts -- at session start, resolve the model: check models[effort], fall back to model
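On the schema side, the new field only needs to accept an optional mapping from effort level to model name. The sketch below shows that validation in dependency-free TypeScript; parseModels and EFFORT_LEVELS are hypothetical names, and the real PhantomConfigSchema may well be defined with a schema library instead. Rejecting unknown keys is an assumption made here, not something the proposal specifies.

```typescript
// Hypothetical helper: the actual PhantomConfigSchema may use a schema
// library; this only illustrates the shape the models field should accept.
const EFFORT_LEVELS = ["low", "medium", "high", "max"] as const;
type EffortLevel = (typeof EFFORT_LEVELS)[number];
type ModelMap = Partial<Record<EffortLevel, string>>;

function parseModels(raw: unknown): ModelMap | undefined {
  // The field is optional: an absent value keeps single-model behavior.
  if (raw === undefined) return undefined;
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("models must be a mapping of effort level to model name");
  }
  const result: ModelMap = {};
  for (const [key, value] of Object.entries(raw)) {
    // Assumption: keys outside the four effort levels are treated as errors.
    if (!(EFFORT_LEVELS as readonly string[]).includes(key)) {
      throw new Error(`unknown effort level: ${key}`);
    }
    if (typeof value !== "string" || value.length === 0) {
      throw new Error(`models.${key} must be a non-empty string`);
    }
    result[key as EffortLevel] = value;
  }
  return result; // partial maps are allowed; missing levels are simply unset
}
```

Failing at config load on a misspelled level such as "hi" avoids a silent fallback to the global model, though a more lenient schema that ignores unknown keys would also fit the proposal.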
No changes to the agent, prompts, or evolution pipeline. This is pure TypeScript plumbing -- it does not violate the Cardinal Rule.
Example Real-World Config
model: claude-sonnet-4-6     # fallback for any effort level not listed
models:
  low: claude-haiku-4-5      # quick questions, status checks
  medium: claude-haiku-4-5   # light tasks
  high: claude-sonnet-4-6    # coding, analysis
  max: claude-opus-4-6       # complex architectural decisions

This alone would cut costs significantly for deployments where most interactions are short conversational exchanges.