Summary
Today agent/core/llm_params.py::_resolve_llm_params hard-routes every
model id into one of three branches:
- anthropic/... → Anthropic API
- openai/... → OpenAI API (no api_base override)
- everything else → forced onto https://router.huggingface.co/v1 with an HF token
There's no escape hatch for a local OpenAI-compatible server (Ollama,
vLLM, LM Studio, llama.cpp's server, etc.), even though litellm
itself supports all of them out of the box.
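For context, here is a standalone snippet (not code from this repo; the model name and port are just Ollama's defaults) showing that litellm only needs an api_base to talk to a local server:

```python
import litellm

# litellm's ollama/ prefix plus an api_base is enough to hit a local runtime;
# the same pattern works for vLLM / LM Studio via their OpenAI-compatible endpoints.
response = litellm.completion(
    model="ollama/llama3",              # any model the local server exposes
    api_base="http://localhost:11434",  # Ollama's default endpoint
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```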
Motivation
Smaller local models almost certainly won't match frontier-model
quality on this agent loop, and that's fine — the value is in opening
up more room for experimentation:
- Cheap / unlimited iteration without per-token costs
- Offline use and stronger privacy for sensitive datasets / papers
- A testbed for the community to benchmark open-weights models on the
same agent harness
- Lower barrier to entry for students and hobbyists who want to study
the agent end-to-end
Suggested shape
Something minimally invasive, e.g.:
- recognize an ollama/... prefix and pass it through to litellm, or
- honor an optional api_base field in main_agent_config.json / a
  LITELLM_API_BASE env var on the "fallback" branch instead of
  hard-coding the HF router URL (rough sketch below).
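In code, the second option might look something like this. This is only a sketch: _resolve_fallback, the config dict shape, and the HF_TOKEN variable name are hypothetical stand-ins, since I'm going purely by the behavior described above, not the actual source.

```python
import os

def _resolve_fallback(model_id: str, config: dict) -> dict:
    # Escape hatch: an explicit api_base (config key or env var) wins over
    # the hard-coded HF router URL; otherwise behavior is unchanged.
    api_base = (
        config.get("api_base")
        or os.environ.get("LITELLM_API_BASE")
        or "https://router.huggingface.co/v1"
    )
    params = {"model": model_id, "api_base": api_base}
    # Only attach the HF token when we're actually pointed at the HF router.
    if api_base == "https://router.huggingface.co/v1":
        params["api_key"] = os.environ["HF_TOKEN"]  # hypothetical var name
    return params
```

With something like that in place, pointing the agent at a local vLLM or Ollama server becomes a one-line config change, and the default HF-router path stays exactly as it is today.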
Related to #56, but specifically about local execution rather than
hosted open-source model APIs.
Happy to test or send a PR if this is on the roadmap.