Conversation
Claude finished @lewtun's task in 4m 6s
PR Review: Add CLI local model support
0 P0, 2 P1
What I checked
P1 — Docstring mismatch on transient-error behavior
This is true for the non-local path (…).
P1 — No test for probe failure path
"Keeps current model on hard error" is an explicitly advertised behavior in the PR description and is worth a second async test. Example sketch: @pytest.mark.asyncio
async def test_probe_and_switch_local_model_rejects_on_connection_error(monkeypatch):
async def failing_acompletion(**kwargs):
raise ConnectionRefusedError("no server")
monkeypatch.setattr(model_switcher, "acompletion", failing_acompletion)
class Config:
model_name = "openai/gpt-5.5"
reasoning_effort = None
session = Session()
await model_switcher.probe_and_switch_model(
"ollama/llama3.1:8b", Config(), session, Console(), hf_token=None
)
assert session.model_id is None # no switch
assert "ollama/llama3.1:8b" not in session.model_effective_effortP2 — Redundant triple validation in
Resolved README usage docs conflict by keeping Claude 4.7 guidance from main and local model examples from this branch.
Co-authored-by: OpenAI Codex <codex@openai.com>
Claude encountered an error. I'll analyze this and get back to you.
Claude finished @lewtun's task in 2m 11s
PR Re-Review: Add CLI local model support
0 P0
This is a re-trigger after merges from ….
What I checked
No new P0 issues. Per REVIEW.md re-review convergence rules, P1s previously flagged (transient-error docstring mismatch, missing probe-failure test) are not re-flagged here — those are deferrable at the author's discretion.
Support LOCAL_LLM_BASE_URL and LOCAL_LLM_API_KEY as shared fallbacks while preserving provider-specific local overrides.
Co-authored-by: OpenAI Codex <codex@openai.com>
…tern into codex/cli-local-models
Claude finished @lewtun's task in 1m 22s
PR Re-Review: Add CLI local model support
0 P0
This re-review covers commit ….
What I checked
No new P0 issues. Per REVIEW.md re-review convergence rules, the prior P1s (transient-error docstring mismatch in …) are not re-flagged here.
Clarify local probe failure behavior, add regression coverage for rejected local switches, and simplify local model validation.
Co-authored-by: OpenAI Codex <codex@openai.com>
Addressed the review feedback from #228 (comment) in ….
Validation: uv run pytest tests/unit/test_llm_params.py tests/unit/test_cli_local_models.py
uv run pytest tests/unit
uv run ruff check .
uv run ruff format --check .
Claude finished @lewtun's task in 1m 37s
PR Re-Review: Add CLI local model support
0 P0
This re-review covers commit ….
What I checked
No new P0 issues. Per REVIEW.md re-review convergence rules, the prior P1s have been resolved in this commit; nothing else to flag.
Claude finished @lewtun's task in 33s
PR Re-Review: Add CLI local model support
0 P0
This re-review covers the merge commit ….
What I checked
No new findings. Per REVIEW.md re-review convergence rules, the prior P1s were resolved in ….
Summary
Adds CLI-only support for local OpenAI-compatible model servers through provider-specific model IDs:
- ollama/<model>
- vllm/<model>
- lm_studio/<model>
- llamacpp/<model>
This keeps the web/backend model catalog unchanged and intentionally does not add the generic openai-compat/ escape hatch yet.
Background / lineage
This PR takes the narrow CLI routing path from the local-model discussion and builds on ideas from prior PRs:
It addresses the CLI slice of #94 without adding frontend/API local model selection.
Usage examples
Start the local inference server first, then select it with a provider-specific prefix:
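For instance, with an Ollama server already running locally (the model tag matches the one used in the tests; the prompt is only illustrative):

```bash
# One-shot run against a local Ollama server (illustrative prompt).
ml-intern --model ollama/llama3.1:8b "summarize the failing test output"
```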
In an interactive session:
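A minimal sketch, assuming the /model command described in the implementation notes below:

```text
/model ollama/llama3.1:8b
```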
Use one shared local endpoint for any local prefix:
```bash
LOCAL_LLM_BASE_URL=http://localhost:8000 LOCAL_LLM_API_KEY=optional-local-key ml-intern --model vllm/custom-model "inspect the training script"
```
Or override a specific provider when running multiple local servers:
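A sketch of that setup; the per-provider variable names shown here (VLLM_BASE_URL, OLLAMA_BASE_URL) are assumptions for illustration and may differ from the actual names:

```bash
# Hypothetical per-provider override variables; check the README for the real names.
VLLM_BASE_URL=http://localhost:8000 \
OLLAMA_BASE_URL=http://localhost:11434 \
ml-intern --model ollama/llama3.1:8b "inspect the training script"
```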
Base URLs may include or omit /v1. Local API keys are optional; when unset, local backends get a non-empty placeholder key by default.
Local endpoint precedence: provider-specific overrides take priority over the shared LOCAL_LLM_* fallbacks.
OPENAI_BASE_URL is intentionally left for openai/... models only. For example:
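A minimal sketch of the split, using placeholder URLs:

```bash
# OPENAI_BASE_URL applies only to openai/... models; local prefixes read LOCAL_LLM_BASE_URL.
OPENAI_BASE_URL=https://proxy.example.com/v1 ml-intern --model openai/gpt-5.5 "review the data loader"
LOCAL_LLM_BASE_URL=http://localhost:8000 ml-intern --model vllm/custom-model "review the data loader"
```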
Implementation notes
- Local provider IDs are routed as openai/<model> with the configured provider base URL.
- Local models are treated as not supporting reasoning_effort; strict mode raises UnsupportedEffortError, while normal runtime drops it.
- /model skips HF Router catalog lookup for local IDs and performs a cheap no-effort probe before switching.
- Local endpoints come from the provider-specific and shared LOCAL_LLM_* vars, and OPENAI_BASE_URL is not reused for local prefixes.
Validation