
Feature request: Allow routing to local OpenAI-compatible backends (Ollama / vLLM / LM Studio) #67

@Ban921

Summary

Today agent/core/llm_params.py::_resolve_llm_params hard-routes every
model id into one of three branches:

  • anthropic/... → Anthropic API
  • openai/... → OpenAI API (no api_base override)
  • everything else → forced onto https://router.huggingface.co/v1
    with an HF token

There's no escape hatch for a local OpenAI-compatible server (Ollama,
vLLM, LM Studio, llama.cpp's server, etc.), even though litellm
itself supports all of them out of the box.
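
For context, litellm can already talk to these backends directly once it's given an api_base; the model names and ports below are just illustrative examples, not anything from this repo:

```python
import litellm

# Ollama via litellm's ollama/ provider prefix (Ollama's default port is 11434):
litellm.completion(
    model="ollama/llama3.1",
    api_base="http://localhost:11434",
    messages=[{"role": "user", "content": "hello"}],
)

# Any OpenAI-compatible server (vLLM, LM Studio, llama.cpp server, ...):
litellm.completion(
    model="openai/my-local-model",          # openai/ prefix = OpenAI-compatible route
    api_base="http://localhost:8000/v1",    # e.g. vLLM's default port
    api_key="not-needed",                   # most local servers ignore the key
    messages=[{"role": "user", "content": "hello"}],
)
```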

Motivation

Smaller local models almost certainly won't match frontier-model
quality on this agent loop, and that's fine — the value is in opening
up more room for experimentation:

  • Cheap / unlimited iteration without per-token costs
  • Offline use and stronger privacy for sensitive datasets / papers
  • A testbed for the community to benchmark open-weights models on the
    same agent harness
  • Lower barrier to entry for students and hobbyists who want to study
    the agent end-to-end

Suggested shape

Something minimally invasive, e.g. (rough sketch after the list):

  • recognize an ollama/... prefix and pass it through to litellm, or
  • honor an optional api_base field in main_agent_config.json /
    LITELLM_API_BASE env var on the "fallback" branch instead of
    hard-coding the HF router URL.
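
A sketch of what the fallback branch could look like with the second option. The signature, return shape, and env var names (LITELLM_API_KEY, HF_TOKEN) are guesses for illustration, not the actual code:

```python
import os

def _resolve_llm_params(model_id: str, config: dict) -> dict:
    # ... anthropic/ and openai/ branches stay exactly as they are ...

    # New escape hatch: a config value or env var wins over the HF router.
    api_base = config.get("api_base") or os.environ.get("LITELLM_API_BASE")
    if model_id.startswith("ollama/") or api_base:
        return {
            "model": model_id,
            "api_base": api_base,  # None lets litellm use the provider's default
            "api_key": os.environ.get("LITELLM_API_KEY", "not-needed"),
        }

    # Existing behavior: everything else goes through the HF router.
    return {
        "model": model_id,
        "api_base": "https://router.huggingface.co/v1",
        "api_key": os.environ["HF_TOKEN"],  # assuming this is the token used today
    }
```

With something like that in place, pointing the agent at a local vLLM would just be LITELLM_API_BASE=http://localhost:8000/v1 plus a model id such as openai/<served-model-name>, with no other config changes.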

Related to #56, but specifically about local execution rather than
hosted open-source model APIs.

Happy to test or send a PR if this is on the roadmap.
