WebSocket relay that bridges AI coding agent CLIs (Claude Code, Codex, Gemini CLI, Snowflake Cortex, and more) to any web interface — stream reasoning, tool calls, and file changes in real time.
```bash
pip install ai-relay

# Start the relay server (default: ws://0.0.0.0:8765)
ai-relay --port 8765

# Persistent server — each WebSocket connection becomes one independent agent session
ai-relay serve --port 9000
```

Then connect from OhWise Lab (or any WebSocket client) and send a handshake:
{"tool": "claude", "folder": "/path/to/project", "model": "claude-sonnet-4-6"}The relay streams structured JSON events over WebSocket and forwards your messages to the selected backend. Claude Code and Gemini use native JSONL process protocols, Codex uses the app-server JSON-RPC protocol, and Snowflake Cortex uses HTTP/SSE. The PTY bridge is retained only for generic/legacy CLI tools.
`ai-relay serve` is designed to run inside a Docker container as a persistent daemon:
```dockerfile
FROM python:3.11-slim
RUN pip install ai-relay
# Install your AI CLI here (e.g. npm install -g @anthropic-ai/claude-code)
CMD ["ai-relay", "serve", "--port", "9000"]
```

Each incoming WebSocket connection spawns an independent agent session. Multiple clients can connect simultaneously.
| Type | Description |
|---|---|
| `session_start` | Process spawned |
| `session_end` | Process exited (includes `exit_code`) |
| `stdout` / `stderr` | Raw output lines |
| `reasoning` | Agent thinking/planning text |
| `tool_call` | Agent invoking a tool (Read, Edit, Bash…) |
| `tool_result` | Result of a tool call |
| `file_diff` | File created or edited |
| `response` | Final answer text |
| `assistant_message` | Native structured assistant message |
| `user_message` | Native structured user/tool-result message |
| `stream_event` | Native streaming event |
| `status` | Native status/control event |
| `permission_request` | Tool permission prompt from a structured backend |
| `permission_cancelled` | Pending permission prompt was cancelled |
| `control_response` | Native control response acknowledgment |
| `tool_progress` | Native tool progress event |
| `quota_warning` | API quota / rate limit detected |
| `context_warning` | Context window nearing limit (includes `context_pct`) |
| `context_compacted` | Context was compacted |
| `error` | Relay or process error |
| `input_ack` | Relay confirms your message was sent to the process |
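Clients typically dispatch on the event type. A sketch of such a handler (any payload field other than `type` and the `context_pct` mentioned above is an illustrative assumption):

```python
# Sketch of a client-side event dispatcher. Only the event types come
# from the table above; payload field names other than context_pct
# are illustrative assumptions.
def handle_event(event: dict) -> None:
    kind = event.get("type")
    if kind == "context_warning":
        # context_pct reports how full the context window is.
        print(f"context window at {event.get('context_pct')}%")
    elif kind == "file_diff":
        print("file created or edited:", event)
    elif kind == "quota_warning":
        print("quota / rate limit hit:", event)
    elif kind == "error":
        print("relay or process error:", event)
```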
Send JSON over WebSocket:
{"text": "refactor the authentication module to use JWT"}Claude Code also accepts structured web-client messages:
{"type": "user_message", "content": "refactor the authentication module to use JWT"}Permission responses:
{"type": "permission_response", "request_id": "req", "behavior": "allow", "updatedInput": {"command": "git status"}}Codex permission responses can also use:
{"type": "permission_response", "request_id": "req", "allow": true}Interrupt the active structured turn:
{"type": "interrupt"}Codex uses codex app-server --listen stdio:// and keeps a persistent thread behind the WebSocket session:
{"tool": "codex", "folder": "/path/to/project", "model": "gpt-5.2"}Gemini CLI uses headless stream-json mode. Each text message starts one Gemini turn:
{"tool": "gemini", "folder": "/path/to/project", "model": "gemini-2.5-flash"}Snowflake Cortex uses API configuration in the handshake.
Cortex chat mode:
```json
{
  "tool": "cortex",
  "mode": "chat",
  "model": "claude-sonnet-4-5",
  "snowflake": {
    "account_url": "https://<account>.snowflakecomputing.com",
    "token_env": "SNOWFLAKE_PAT"
  }
}
```

Cortex Analyst mode:
```json
{
  "tool": "cortex",
  "mode": "analyst",
  "snowflake": {
    "account_url": "https://<account>.snowflakecomputing.com",
    "token_env": "SNOWFLAKE_PAT",
    "semantic_view": "DB.SCHEMA.VIEW"
  }
}
```
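Note that `token_env` names an environment variable on the relay host, so the PAT presumably never travels over the WebSocket and must be set in the relay process's environment. A hypothetical analyst-mode round trip (again assuming the `websockets` package):

```python
# Sketch: open an analyst-mode Cortex session and ask one question.
# SNOWFLAKE_PAT must be set in the relay server's environment;
# the question text below is purely illustrative.
import asyncio
import json

import websockets


async def ask(question: str):
    async with websockets.connect("ws://localhost:8765") as ws:
        await ws.send(json.dumps({
            "tool": "cortex",
            "mode": "analyst",
            "snowflake": {
                "account_url": "https://<account>.snowflakecomputing.com",
                "token_env": "SNOWFLAKE_PAT",
                "semantic_view": "DB.SCHEMA.VIEW",
            },
        }))
        await ws.send(json.dumps({"text": question}))
        # Wait for the final answer event.
        async for raw in ws:
            event = json.loads(raw)
            if event.get("type") == "response":
                return event


print(asyncio.run(ask("total revenue by month for 2024")))
```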
{"text": "/compact"}| Tool | Adapter | tool value |
|---|---|---|
| Claude Code | `ClaudeCodeAdapter` | `"claude"` / `"claude-code"` |
| OpenAI Codex | `CodexAdapter` | `"codex"` |
| Gemini CLI | `GeminiAdapter` | `"gemini"` |
| Snowflake Cortex | `CortexAdapter` | `"cortex"` |
| Any CLI | `GenericAdapter` | `"generic"` |
When using ohwise-lab alongside ai-relay, all settings are controlled via environment variables — nothing is hardcoded:
| Variable | Default | Description |
|---|---|---|
| `LAB_MODE` | `single` | `single` = subprocesses in lab-ctrl; `multi` = per-user Docker containers |
| `LAB_WORKSPACE_ROOT` | `/var/ohwise-lab-workspaces` | Host path for user workspace volumes |
| `LAB_IMAGE` | `ohwise-lab-ctrl:local` | Docker image for user containers (must have ai-relay + CLIs installed) |
| `LAB_NETWORK` | `lab-network` | Docker network user containers join. For Compose: `<project>_default` |
| `LAB_CONTAINER_PORT` | `9000` | Internal port `ai-relay serve` listens on inside user containers |
| `LAB_CONTAINER_USER` | `labuser` | OS user inside user containers |
| `LAB_CONTAINER_HOME` | `/home/<LAB_CONTAINER_USER>` | Home dir inside user containers |
| `LAB_CONTAINER_STARTUP_DELAY` | `1.5` | Seconds to wait for `ai-relay` to be ready after container start |
| `LAB_CONTAINER_WS_TIMEOUT` | `15` | Seconds to wait for the WebSocket connection to a user container |
| `LAB_IDLE_TIMEOUT_SECS` | `1800` | Seconds of inactivity before a user container is eligible for cleanup |
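To make the defaults concrete, they correspond to plain environment lookups like the following (a sketch only; ohwise-lab performs these reads internally):

```python
# Sketch: the defaults from the table above as environment lookups.
import os

LAB_MODE = os.environ.get("LAB_MODE", "single")
LAB_WORKSPACE_ROOT = os.environ.get("LAB_WORKSPACE_ROOT", "/var/ohwise-lab-workspaces")
LAB_IMAGE = os.environ.get("LAB_IMAGE", "ohwise-lab-ctrl:local")
LAB_NETWORK = os.environ.get("LAB_NETWORK", "lab-network")
LAB_CONTAINER_PORT = int(os.environ.get("LAB_CONTAINER_PORT", "9000"))
LAB_CONTAINER_STARTUP_DELAY = float(os.environ.get("LAB_CONTAINER_STARTUP_DELAY", "1.5"))
LAB_CONTAINER_WS_TIMEOUT = int(os.environ.get("LAB_CONTAINER_WS_TIMEOUT", "15"))
LAB_IDLE_TIMEOUT_SECS = int(os.environ.get("LAB_IDLE_TIMEOUT_SECS", "1800"))
```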
The relay can also be embedded in your own Python process:

```python
from ai_relay import RelayServer

server = RelayServer(host="0.0.0.0", port=8765)
server.run()
```

See CHANGELOG.md for release notes.
MIT