# ai-relay

WebSocket relay that bridges AI coding agent CLIs (Claude Code, Codex, Gemini CLI, Snowflake Cortex, and more) to any web interface — stream reasoning, tool calls, and file changes in real time.

## Install

```bash
pip install ai-relay
```

## Quick start

### One-shot mode (local dev)

```bash
# Start the relay server (default: ws://0.0.0.0:8765)
ai-relay --port 8765
```

### Server mode (container / daemon)

```bash
# Persistent server — each WebSocket connection becomes one independent agent session
ai-relay serve --port 9000
```

Then connect from OhWise Lab (or any WebSocket client) and send a handshake:

{"tool": "claude", "folder": "/path/to/project", "model": "claude-sonnet-4-6"}

The relay streams structured JSON events over WebSocket and forwards your messages to the selected backend. Claude Code and Gemini use native JSONL process protocols, Codex uses the app-server JSON-RPC protocol, and Snowflake Cortex uses HTTP/SSE. The PTY bridge is retained only for generic/legacy CLI tools.
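A minimal client sketch of this flow, assuming the `websockets` Python package and a relay on the default port (the task text and event handling are illustrative):

```python
# Minimal client: handshake, send a task, stream events until the
# session ends.
import asyncio
import json

import websockets


async def main() -> None:
    async with websockets.connect("ws://localhost:8765") as ws:
        # Handshake: choose the backend, project folder, and model.
        await ws.send(json.dumps({
            "tool": "claude",
            "folder": "/path/to/project",
            "model": "claude-sonnet-4-6",
        }))
        # Send one task, then print structured events as they arrive.
        await ws.send(json.dumps({"text": "summarize this repo"}))
        async for raw in ws:
            event = json.loads(raw)
            print(event.get("type"), event)
            if event.get("type") == "session_end":
                break


asyncio.run(main())
```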

## Running in a container

`ai-relay serve` is designed to run inside a Docker container as a persistent daemon:

```dockerfile
FROM python:3.11-slim
RUN pip install ai-relay
# Install your AI CLI here (e.g. npm install -g @anthropic-ai/claude-code)
CMD ["ai-relay", "serve", "--port", "9000"]
```

Each incoming WebSocket connection spawns an independent agent session. Multiple clients can connect simultaneously.
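For example (the image name and host port are illustrative; whichever CLI you install will also need its own credentials passed in, e.g. via `-e`):

```bash
docker build -t ai-relay-demo .
docker run -d -p 9000:9000 ai-relay-demo
```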

## Event types

| Type | Description |
| --- | --- |
| `session_start` | Process spawned |
| `session_end` | Process exited (includes `exit_code`) |
| `stdout` / `stderr` | Raw output lines |
| `reasoning` | Agent thinking/planning text |
| `tool_call` | Agent invoking a tool (Read, Edit, Bash…) |
| `tool_result` | Result of a tool call |
| `file_diff` | File created or edited |
| `response` | Final answer text |
| `assistant_message` | Native structured assistant message |
| `user_message` | Native structured user/tool-result message |
| `stream_event` | Native streaming event |
| `status` | Native status/control event |
| `permission_request` | Tool permission prompt from a structured backend |
| `permission_cancelled` | Pending permission prompt was cancelled |
| `control_response` | Native control response acknowledgment |
| `tool_progress` | Native tool progress event |
| `quota_warning` | API quota / rate limit detected |
| `context_warning` | Context window nearing limit (includes `context_pct`) |
| `context_compacted` | Context was compacted |
| `error` | Relay or process error |
| `input_ack` | Relay confirms your message was sent to the process |
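A client might route these by their `type` field, for example (a sketch; only the payload fields called out in the table are assumed):

```python
import json


def dispatch(raw: str) -> None:
    """Route one relay event to a handler based on its type."""
    event = json.loads(raw)
    kind = event.get("type")
    if kind == "reasoning":
        print("agent is thinking:", event)
    elif kind in ("tool_call", "tool_result"):
        print("tool activity:", event)
    elif kind == "file_diff":
        print("file changed:", event)
    elif kind == "context_warning":
        print(f"context at {event.get('context_pct')}%")
    elif kind == "session_end":
        print("process exited with code", event.get("exit_code"))
```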

## Sending commands

Send JSON over WebSocket:

{"text": "refactor the authentication module to use JWT"}

Claude Code also accepts structured web-client messages:

{"type": "user_message", "content": "refactor the authentication module to use JWT"}

Permission responses:

{"type": "permission_response", "request_id": "req", "behavior": "allow", "updatedInput": {"command": "git status"}}

Codex permission responses can also use:

{"type": "permission_response", "request_id": "req", "allow": true}

Interrupt the active structured turn:

{"type": "interrupt"}

Codex uses `codex app-server --listen stdio://` and keeps a persistent thread behind the WebSocket session:

{"tool": "codex", "folder": "/path/to/project", "model": "gpt-5.2"}

Gemini CLI uses headless `stream-json` mode. Each text message starts one Gemini turn:

{"tool": "gemini", "folder": "/path/to/project", "model": "gemini-2.5-flash"}

Snowflake Cortex takes its API configuration from the handshake (it speaks HTTP/SSE rather than wrapping a CLI process).

Cortex chat mode:

```json
{
  "tool": "cortex",
  "mode": "chat",
  "model": "claude-sonnet-4-5",
  "snowflake": {
    "account_url": "https://<account>.snowflakecomputing.com",
    "token_env": "SNOWFLAKE_PAT"
  }
}
```

Cortex Analyst mode:

```json
{
  "tool": "cortex",
  "mode": "analyst",
  "snowflake": {
    "account_url": "https://<account>.snowflakecomputing.com",
    "token_env": "SNOWFLAKE_PAT",
    "semantic_view": "DB.SCHEMA.VIEW"
  }
}
```
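Putting it together, a client session might look like this (a sketch using the `websockets` package; the relay process, not the client, must have `SNOWFLAKE_PAT` set in its environment, and the question text is illustrative):

```python
import asyncio
import json

import websockets


async def ask(question: str) -> None:
    async with websockets.connect("ws://localhost:9000") as ws:
        # Handshake with the Cortex Analyst configuration shown above.
        await ws.send(json.dumps({
            "tool": "cortex",
            "mode": "analyst",
            "snowflake": {
                "account_url": "https://<account>.snowflakecomputing.com",
                "token_env": "SNOWFLAKE_PAT",
                "semantic_view": "DB.SCHEMA.VIEW",
            },
        }))
        await ws.send(json.dumps({"text": question}))
        async for raw in ws:
            event = json.loads(raw)
            if event.get("type") == "response":
                print(event)
                break


asyncio.run(ask("What were total sales last quarter?"))
```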

To send CLI commands (e.g. `/compact`, `/clear`):

{"text": "/compact"}

## Supported tools

| Tool | Adapter | `tool` value |
| --- | --- | --- |
| Claude Code | `ClaudeCodeAdapter` | `"claude"` / `"claude-code"` |
| OpenAI Codex | `CodexAdapter` | `"codex"` |
| Gemini CLI | `GeminiAdapter` | `"gemini"` |
| Snowflake Cortex | `CortexAdapter` | `"cortex"` |
| Any CLI | `GenericAdapter` | `"generic"` |

## Configuration reference (ohwise-lab-ctrl)

When using ohwise-lab alongside ai-relay, all settings are controlled via environment variables — nothing is hardcoded:

| Variable | Default | Description |
| --- | --- | --- |
| `LAB_MODE` | `single` | `single` = subprocesses in lab-ctrl; `multi` = per-user Docker containers |
| `LAB_WORKSPACE_ROOT` | `/var/ohwise-lab-workspaces` | Host path for user workspace volumes |
| `LAB_IMAGE` | `ohwise-lab-ctrl:local` | Docker image for user containers (must have ai-relay + CLIs installed) |
| `LAB_NETWORK` | `lab-network` | Docker network user containers join. For Compose: `<project>_default` |
| `LAB_CONTAINER_PORT` | `9000` | Internal port `ai-relay serve` listens on inside user containers |
| `LAB_CONTAINER_USER` | `labuser` | OS user inside user containers |
| `LAB_CONTAINER_HOME` | `/home/<LAB_CONTAINER_USER>` | Home directory inside user containers |
| `LAB_CONTAINER_STARTUP_DELAY` | `1.5` | Seconds to wait for ai-relay to be ready after container start |
| `LAB_CONTAINER_WS_TIMEOUT` | `15` | Seconds to wait for a WebSocket connection to a user container |
| `LAB_IDLE_TIMEOUT_SECS` | `1800` | Seconds of inactivity before a user container is eligible for cleanup |
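For example, a multi-container deployment might set (values illustrative; deliver them however your setup passes environment variables, e.g. an env file or a Compose `environment:` list):

```bash
# environment for ohwise-lab-ctrl (illustrative values)
LAB_MODE=multi
LAB_WORKSPACE_ROOT=/var/ohwise-lab-workspaces
LAB_IMAGE=ohwise-lab-ctrl:local
LAB_NETWORK=mystack_default
LAB_IDLE_TIMEOUT_SECS=3600
```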

## Python API

```python
from ai_relay import RelayServer

server = RelayServer(host="0.0.0.0", port=8765)
server.run()
```

## Changelog

See [CHANGELOG.md](CHANGELOG.md) for release notes.

## License

MIT
