Model routing - map effort levels to different models #24

@arne1101

Use Case

Phantom uses a single global model field in phantom.yaml. Every session uses the same model regardless of task complexity -- a quick status question costs the same as a multi-file refactor.

This is especially noticeable because Phantom's Claude Code SDK preset adds roughly 350k tokens of baseline per session. Most of that cost is justified for complex coding or reasoning tasks. For short conversational exchanges, it is not.

The effort field already exists as the agent's signal for how much work a task requires (low / medium / high / max). Model selection is the natural next step: let the effort level also determine which model is used.

Proposed Config

New optional models block in phantom.yaml:

# phantom.yaml

# Existing field -- still used as fallback when models: is absent.
# Backward compatible: if you don't add models:, nothing changes.
model: claude-sonnet-4-6

# Optional: map each effort level to a specific model.
models:
  low: claude-haiku-4-5
  medium: claude-haiku-4-5
  high: claude-sonnet-4-6
  max: claude-opus-4-6

Behavior:

  • If models is present: the model is selected per-session by looking up the current effort value in the map.
  • If models is absent: the existing single-model behavior is unchanged.
  • Partial maps are valid: unspecified effort levels fall back to the global model field.
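
The three rules above amount to one lookup with a fallback. A minimal sketch in TypeScript, assuming the config is already parsed into an object; the names Effort, ModelRoutingConfig, and resolveModel are hypothetical and not taken from Phantom's codebase:

```typescript
// Hypothetical types for illustration; Phantom's real config types may differ.
type Effort = "low" | "medium" | "high" | "max";

interface ModelRoutingConfig {
  model: string;                             // existing global field, used as fallback
  models?: Partial<Record<Effort, string>>;  // proposed optional per-effort map
}

// Look up the effort level in the map; fall back to the global model when
// the map is absent or has no entry for this level.
function resolveModel(config: ModelRoutingConfig, effort: Effort): string {
  return config.models?.[effort] ?? config.model;
}

// A partial map: only low and max are routed explicitly.
const partial: ModelRoutingConfig = {
  model: "claude-sonnet-4-6",
  models: { low: "claude-haiku-4-5", max: "claude-opus-4-6" },
};

console.log(resolveModel(partial, "low"));                        // claude-haiku-4-5
console.log(resolveModel(partial, "medium"));                     // claude-sonnet-4-6 (fallback)
console.log(resolveModel({ model: "claude-sonnet-4-6" }, "max")); // claude-sonnet-4-6 (no map)
```

Because the fallback is the existing model field, a config without models resolves exactly as it does today.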

Implementation Notes

The change is small and confined to two files:

  • src/config/schemas.ts -- extend PhantomConfigSchema with an optional models record field
  • src/agent/runtime.ts -- at session start, resolve the model: check models[effort], fall back to model
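
For the schema side, here is a rough, dependency-free sketch of the validation the new field would need. PhantomConfigSchema presumably uses a schema library, so this hand-written validator is only a stand-in for the idea; EFFORT_LEVELS, ModelsMap, and parseModels are hypothetical names:

```typescript
// Hypothetical stand-in for the schema extension; not Phantom's actual code.
const EFFORT_LEVELS = ["low", "medium", "high", "max"] as const;
type EffortLevel = (typeof EFFORT_LEVELS)[number];
type ModelsMap = Partial<Record<EffortLevel, string>>;

// Validate the raw `models` value from phantom.yaml.
// Returns undefined when the block is absent, so existing configs stay valid.
function parseModels(raw: unknown): ModelsMap | undefined {
  if (raw === undefined || raw === null) return undefined;
  if (typeof raw !== "object" || Array.isArray(raw)) {
    throw new Error("models must be a mapping of effort level to model name");
  }
  const input = raw as Record<string, unknown>;
  const result: ModelsMap = {};
  for (const key of Object.keys(input)) {
    // Reject keys that are not one of the four effort levels.
    if ((EFFORT_LEVELS as readonly string[]).indexOf(key) === -1) {
      throw new Error(`unknown effort level in models: ${key}`);
    }
    const value = input[key];
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`models.${key} must be a non-empty string`);
    }
    result[key as EffortLevel] = value;
  }
  return result;
}
```

Rejecting unknown keys catches typos such as `hihg:` at load time rather than letting them silently fall back to the global model; whether to fail or warn in that case is a design choice the issue leaves open.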

No changes to the agent, prompts, or evolution pipeline. This is pure TypeScript plumbing -- it does not violate the Cardinal Rule.

Example Real-World Config

model: claude-sonnet-4-6   # fallback for any effort level not listed
models:
  low: claude-haiku-4-5    # quick questions, status checks
  medium: claude-haiku-4-5 # light tasks
  high: claude-sonnet-4-6  # coding, analysis
  max: claude-opus-4-6     # complex architectural decisions

With this config, only high and max sessions run on the larger models, which should cut costs significantly for deployments where most interactions are short conversational exchanges.
