
Custom OpenAI-Compatible Provider Endpoints (BYO Endpoint for Local & Remote LLMs) #9303

@tbitcs

Description


Pre-submit Checks

Describe the solution you'd like?

Add support for configuring custom OpenAI-compatible API endpoints, allowing Warp to connect to local or self-hosted LLM providers.

This would enable users to specify a custom base URL, a model name, and an optional API key.

The configured endpoint would be used for Warp AI features such as command suggestions, agent interactions, and inline completions.

The implementation should support standard OpenAI-compatible APIs, including chat completions and streaming responses, so that it works with tools like vLLM, Ollama (via compatibility layer), and other OpenAI-compatible servers.
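
As a rough illustration of the request shape such a provider would need to issue, the sketch below streams a chat completion through the OpenAI Python SDK pointed at a custom base URL. The endpoint, model name, and prompt are placeholders taken from the example configuration further down, not a prescribed setup.

# Hypothetical sketch: a streaming chat completion against a user-configured
# OpenAI-compatible endpoint. Base URL and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # custom endpoint root
    api_key="optional",                   # many local servers ignore the key
)

stream = client.chat.completions.create(
    model="meta-llama/Llama-3-8b-instruct",
    messages=[{"role": "user", "content": "Suggest a command to list open ports"}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)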

From a user perspective, this would extend the existing “Bring Your Own Key” model into a “Bring Your Own Endpoint” workflow, where users can choose between built-in providers and a custom endpoint.

Is your feature request related to a problem? Please describe.

Yes.

Currently, Warp AI only supports a fixed set of providers (e.g. OpenAI, Anthropic, Google) via API keys. There is no way to connect Warp to local or self-hosted models.

This creates several limitations:

  • Users cannot run Warp AI in privacy-sensitive or offline environments
  • Teams with existing LLM infrastructure cannot integrate it
  • Developers cannot use open-source models running locally
  • All prompts must be sent to external providers, even when a local option is available

Many modern LLM tools (such as vLLM and Ollama) expose OpenAI-compatible APIs, but Warp cannot currently take advantage of this without native support for custom endpoints.
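
To illustrate what "OpenAI-compatible" means in practice, the sketch below probes a local server by listing the models it exposes; any backend that answers /v1/models and /v1/chat/completions can be driven the same way. The URL is only an example (Ollama's compatibility layer is typically served at this address), not an assumption about any particular setup.

# Hypothetical compatibility probe: list the models a local server exposes.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

for model in client.models.list():
    print(model.id)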

Additional context

This request consolidates several related feature requests.

The proposed approach is to extend Warp’s existing provider model to allow any OpenAI-compatible endpoint, rather than introducing provider-specific integrations for each tool.

This would allow immediate compatibility with:

  • vLLM (OpenAI-compatible server)
  • Ollama (local runtime with compatibility layer)
  • LiteLLM and similar proxy systems
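
Because only the base URL differs between these backends, a single custom-endpoint code path could cover all of them. The sketch below is illustrative only; the ports shown are the tools' common defaults and will vary per installation.

# Illustration: the same client code targets any backend by swapping the base URL.
from openai import OpenAI

ENDPOINTS = {
    "vllm": "http://localhost:8000/v1",     # vLLM OpenAI-compatible server
    "ollama": "http://localhost:11434/v1",  # Ollama compatibility layer
    "litellm": "http://localhost:4000/v1",  # LiteLLM proxy
}

def make_client(backend: str) -> OpenAI:
    # Most local servers accept any placeholder key.
    return OpenAI(base_url=ENDPOINTS[backend], api_key="not-needed")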

Example configuration:

{
  "provider": "custom",
  "base_url": "http://localhost:8000/v1",
  "model": "meta-llama/Llama-3-8b-instruct",
  "api_key": "optional"
}
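
For completeness, a minimal sketch of how this configuration could be represented and validated is shown below. The field names mirror the example above; the class and function names are hypothetical and not Warp's actual schema.

# Hypothetical representation of the proposed "custom" provider config.
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class CustomProviderConfig:
    provider: str                  # "custom" selects the BYO-endpoint path
    base_url: str                  # OpenAI-compatible /v1 root
    model: str                     # model identifier understood by the server
    api_key: Optional[str] = None  # optional for local/self-hosted servers

def load_config(raw: str) -> CustomProviderConfig:
    data = json.loads(raw)
    if data.get("provider") != "custom":
        raise ValueError("expected provider == 'custom'")
    return CustomProviderConfig(
        provider=data["provider"],
        base_url=data["base_url"],
        model=data["model"],
        api_key=data.get("api_key"),
    )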

This is a provider-level feature and is expected to be cross-platform across all Warp-supported operating systems.

Operating system (OS)

Windows

How important is this feature to you?

5 (Can't work without it!)



    Labels

  • area:agent - Agent workflows, conversations, prompts, cloud mode, and AI-specific UI.
  • area:settings-keybindings - Settings UI, preferences, keybindings, and keyboard-shortcut management.
  • duplicate - This issue or pull request already exists.
  • enhancement - New feature or request.
  • repro:high - The report includes enough evidence that the issue appears highly reproducible.
  • triaged - Issue has received an initial automated triage pass.
