1 change: 1 addition & 0 deletions cmd/gateway_consumer_handlers.go
@@ -459,6 +459,7 @@ func handleResetCommand(
deps.SessStore.Reset(ctx, sessionKey)
deps.SessStore.Save(ctx, sessionKey)
providers.ResetCLISession("", sessionKey)
providers.ResetCursorCLISession("", sessionKey)
slog.Info("inbound: /reset command", "session", sessionKey)

return true
46 changes: 46 additions & 0 deletions cmd/gateway_providers.go
@@ -170,6 +170,26 @@ func registerProviders(registry *providers.Registry, cfg *config.Config) {
slog.Info("registered provider", "name", "claude-cli")
}

// Cursor CLI — browser auth on the server (`agent login`).
if cfg.Providers.CursorCLI.CLIPath != "" {
	cliPath := cfg.Providers.CursorCLI.CLIPath
	var opts []providers.CursorCLIOption
	if cfg.Providers.CursorCLI.Model != "" {
		opts = append(opts, providers.WithCursorCLIModel(cfg.Providers.CursorCLI.Model))
	}
	if cfg.Providers.CursorCLI.BaseWorkDir != "" {
		opts = append(opts, providers.WithCursorCLIWorkDir(cfg.Providers.CursorCLI.BaseWorkDir))
	}
	if cfg.Providers.CursorCLI.PermMode != "" {
		opts = append(opts, providers.WithCursorCLIPermMode(cfg.Providers.CursorCLI.PermMode))
	}
	gatewayAddr := loopbackAddr(cfg.Gateway.Host, cfg.Gateway.Port)
	mcpData := providers.BuildCLIMCPConfigData(cfg.Tools.McpServers, gatewayAddr, cfg.Gateway.Token)
	opts = append(opts, providers.WithCursorCLIMCPConfigData(mcpData))
	registry.Register(providers.NewCursorCLIProvider(cliPath, opts...))
	slog.Info("registered provider", "name", "cursor-cli")
}

// ACP provider (config-based) — orchestrates any ACP-compatible agent binary
if cfg.Providers.ACP.Binary != "" {
registerACPFromConfig(registry, cfg.Providers.ACP)
@@ -278,6 +298,32 @@ func registerProvidersFromDB(registry *providers.Registry, provStore store.Provi
slog.Info("registered provider from DB", "name", p.Name)
continue
}
if p.ProviderType == store.ProviderCursorCLI {
	cliPath := p.APIBase // reuse APIBase field for CLI path
	if cliPath == "" {
		cliPath = "agent"
	}
	if cliPath != "agent" && !filepath.IsAbs(cliPath) {
		slog.Warn("security.cursor_cli: invalid path from DB, using default", "path", cliPath)
		cliPath = "agent"
	}
	if _, err := exec.LookPath(cliPath); err != nil {
		slog.Warn("cursor-cli: binary not found, skipping", "path", cliPath, "error", err)
		continue
	}
	var cursorOpts []providers.CursorCLIOption
	if pm := providers.PermModeFromCursorCLISettings(p.Settings); pm != "" {
		cursorOpts = append(cursorOpts, providers.WithCursorCLIPermMode(pm))
	}
	if gatewayAddr != "" {
		mcpData := providers.BuildCLIMCPConfigData(nil, gatewayAddr, gatewayToken)
		mcpData.AgentMCPLookup = buildMCPServerLookup(mcpStore)
		cursorOpts = append(cursorOpts, providers.WithCursorCLIMCPConfigData(mcpData))
	}
	registry.Register(providers.NewCursorCLIProvider(cliPath, cursorOpts...))
	slog.Info("registered provider from DB", "name", p.Name)
	continue
}
// ACP provider — no API key needed (agents manage their own auth).
if p.ProviderType == store.ProviderACP {
registerACPFromDB(registry, p)
113 changes: 109 additions & 4 deletions docs/02-providers.md
@@ -1,6 +1,6 @@
# 02 - LLM Providers

GoClaw abstracts LLM communication behind a single `Provider` interface, allowing the agent loop to work with any backend without knowing the wire format. Six concrete implementations exist: Anthropic (native HTTP+SSE), OpenAI-compatible (covering 10+ API endpoints), Claude CLI (local binary), Codex (OAuth-based), ACP (subagent orchestration), and DashScope (Alibaba Qwen with thinking).
GoClaw abstracts LLM communication behind a single `Provider` interface, allowing the agent loop to work with any backend without knowing the wire format. Seven concrete implementations exist: Anthropic (native HTTP+SSE), OpenAI-compatible (covering 10+ API endpoints), Claude CLI (local binary), Cursor CLI (local binary), Codex (OAuth-based), ACP (subagent orchestration), and DashScope (Alibaba Qwen with thinking).

---

@@ -15,9 +15,11 @@ flowchart TD
PI --> ANTH["Anthropic Provider<br/>native net/http + SSE"]
PI --> OAI["OpenAI-Compatible Provider<br/>generic HTTP client"]
PI --> CLAUDE["Claude CLI Provider<br/>stdio subprocess"]
PI --> CURSOR["Cursor CLI Provider<br/>stdio subprocess"]
PI --> CODEX["Codex Provider<br/>OAuth-based Responses API"]
PI --> ACP["ACP Provider<br/>JSON-RPC 2.0 subagents"]
PI --> DASH["DashScope Provider<br/>OpenAI-compat wrapper"]

ANTH --> ANTHROPIC["Claude API<br/>api.anthropic.com/v1"]
OAI --> OPENAI["OpenAI API"]
@@ -27,35 +29,41 @@ flowchart TD
OAI --> GEM["Gemini API"]
OAI --> OTHER["Mistral / xAI / MiniMax<br/>Cohere / Perplexity / Ollama"]
CLAUDE --> CLI["claude CLI binary<br/>stdio + MCP bridge"]
CURSOR --> CURSOR_CLI["agent (Cursor) binary<br/>stdio + MCP bridge"]
CODEX --> CODEX_API["ChatGPT Responses API<br/>chatgpt.com/backend-api"]
ACP --> AGENTS["Claude Code / Codex<br/>Gemini CLI agents"]
DASH --> QWEN["Alibaba DashScope<br/>Qwen3 models"]
```

Authentication and timeouts vary by provider type:
- **Anthropic**: `x-api-key` header + `anthropic-version: 2023-06-01`
- **OpenAI-compatible**: `Authorization: Bearer` token
- **Claude CLI**: stdio subprocess (no auth; uses local CLI session)
- **Cursor CLI**: stdio subprocess (browser auth via `agent login` on the server; no API key in GoClaw)
- **Codex**: OAuth access token (auto-refreshed via TokenSource)
- **ACP**: JSON-RPC 2.0 over subprocess stdio
- **DashScope**: `Authorization: Bearer` token (inherits from OpenAI-compatible)

All HTTP-based providers (Anthropic, OpenAI-compatible, Codex) use a 300-second timeout.

---

## 2. Supported Providers

### Six Core Provider Types
### Seven Core Provider Types

| Provider | Type | Configuration | Default Model |
|----------|------|----------|---------------|
| **anthropic** | Native HTTP + SSE | API key required | `claude-sonnet-4-5-20250929` |
| **claude_cli** | stdio subprocess + MCP | Binary path (default: `claude`) | `sonnet` |
| **cursor_cli** | stdio subprocess + MCP | Binary path (default: `agent`), browser login | `composer-2` |
| **codex** | OAuth Responses API | OAuth token source | `gpt-5.3-codex` |
| **acp** | JSON-RPC 2.0 subagents | Binary + workspace dir | `claude` |
| **dashscope** | OpenAI-compat wrapper | API key + custom models | `qwen3-max` |
| **openai** (+ 10+ variants) | OpenAI-compatible | API key + endpoint URL | Model-specific |

### OpenAI-Compatible Providers

@@ -557,7 +565,98 @@ Claude CLI inherits thinking support from the underlying Claude model. Thinking

---

## 12. Codex Provider
## 12. Cursor CLI Provider

The Cursor CLI provider enables GoClaw to delegate requests to a local `agent` (Cursor) CLI binary. Like Claude CLI, it manages session history, context files, and tool execution independently. Cursor provides fast inference with multimodal support and extended context.

### Architecture Overview

```mermaid
flowchart TD
AL["Agent Loop"] -->|Chat / ChatStream| CLI["CursorCLIProvider"]
CLI --> POOL["SessionPool"]
POOL -->|spawn/reuse| PROC["Subprocess<br/>agent --print --output-format stream-json"]
PROC -->|manages| SESS["Session<br/>(chat ID, history)"]

SESS -->|AGENTS.md system prompt| TOOLS["CLI Tool Execution"]
SESS -->|.cursor/mcp.json MCP config| TOOLS
SESS -->|--resume chatId| TOOLS

TOOLS -->|via MCP| MCP["MCP Servers<br/>(if configured)"]
```

### Configuration

CursorCLIProvider can be configured in `config.json`:

```json5
{
  "providers": {
    "cursor_cli": {
      "cli_path": "agent", // optional; default binary name or absolute path
      "model": "composer-2", // default model
      "base_work_dir": "/tmp/cursor-workspaces", // workspace directory base
      "perm_mode": "force" // optional: "force" | "default" | "sandbox" (see Headless Flags)
    }
  }
}
```

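Alternatively, register the provider through the `llm_providers` database table with `provider_type = "cursor_cli"`. For DB-registered providers the CLI path is read from the provider's API base field (defaulting to `agent`), and `perm_mode` is stored in the `settings` JSON with the same values as in `config.json`.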
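
As a rough illustration of reading `perm_mode` out of the `settings` JSON, the sketch below decodes a raw settings blob and returns the mode only when it is one of the documented values. The helper name `permModeFromSettings`, its `[]byte` input, and the choice to return an empty string for unknown values are assumptions made for this example; the actual `providers.PermModeFromCursorCLISettings` may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// permModeFromSettings is a hypothetical helper (not the repository's
// implementation). It extracts "perm_mode" from a settings JSON blob and
// returns "" when the blob is empty, malformed, or holds an unknown value.
func permModeFromSettings(raw []byte) string {
	if len(raw) == 0 {
		return ""
	}
	var s struct {
		PermMode string `json:"perm_mode"`
	}
	if err := json.Unmarshal(raw, &s); err != nil {
		return ""
	}
	switch s.PermMode {
	case "force", "default", "strict", "sandbox":
		return s.PermMode
	}
	return ""
}

func main() {
	fmt.Printf("%q\n", permModeFromSettings([]byte(`{"perm_mode":"sandbox"}`)))
	fmt.Printf("%q\n", permModeFromSettings([]byte(`{"perm_mode":"yolo"}`)))
	fmt.Printf("%q\n", permModeFromSettings(nil))
}
```

Returning an empty string for anything unrecognized lets the caller fall back to the provider's default mode instead of passing an unvalidated value to the CLI.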

### Authentication

Prefer **browser authentication** on the machine where GoClaw runs (same pattern as Cursor docs):

1. Run `agent login` on the server (`agent help login`; set `NO_OPEN_BROWSER` to skip opening a browser).
2. Check with `agent status` (output may be JSON or plain text). GoClaw exposes `GET /v1/providers/cursor-cli/auth-status`, which runs `agent status` and falls back to `agent about` if needed.

GoClaw does not read or store Cursor API keys; subprocess env strips inherited `CURSOR*` variables so the CLI uses the on-disk session from `agent login`.
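
The environment stripping described above could look roughly like the following sketch. The function name `stripCursorEnv` and the plain prefix match on the variable name are assumptions for illustration, not the provider's actual code; note that `GOCLAW_CURSOR_CLI_*` variables are unaffected because their names do not start with `CURSOR`.

```go
package main

import (
	"fmt"
	"strings"
)

// stripCursorEnv returns a copy of env (in "KEY=value" form) without any
// entry whose key starts with "CURSOR". Hypothetical helper: the exact
// matching rule used by GoClaw is an assumption here.
func stripCursorEnv(env []string) []string {
	out := make([]string, 0, len(env))
	for _, kv := range env {
		key, _, _ := strings.Cut(kv, "=")
		if strings.HasPrefix(key, "CURSOR") {
			continue // drop inherited Cursor variables such as CURSOR_API_KEY
		}
		out = append(out, kv)
	}
	return out
}

func main() {
	env := []string{
		"PATH=/usr/bin",
		"CURSOR_API_KEY=secret",
		"GOCLAW_CURSOR_CLI_MODEL=composer-2",
		"HOME=/root",
	}
	for _, kv := range stripCursorEnv(env) {
		fmt.Println(kv)
	}
}
```

In a real subprocess launch, the filtered slice would typically be assigned to the `Env` field of an `exec.Cmd`, with `os.Environ()` as the input.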

Environment variables:

- `GOCLAW_CURSOR_CLI_PATH` — CLI binary path override
- `GOCLAW_CURSOR_CLI_MODEL` — Default model override
- `GOCLAW_CURSOR_CLI_WORK_DIR` — Base workspace directory override
- `GOCLAW_CURSOR_CLI_PERM_MODE` — Permission mode override (`force`, `default`, `sandbox`)

### Session Management

Each conversation gets a persistent session tied to the `session_key` option. Sessions survive across multiple requests and maintain:
- Chat ID (server-assigned; persisted to `.cursor_session_id`)
- Workspace directory (for file operations)
- MCP server connections
- System prompt file (`AGENTS.md`)
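
A minimal sketch of how a chat ID might be persisted to and restored from `.cursor_session_id` inside the workspace directory. The helper names, the file permissions, and the trailing-newline format are assumptions for illustration; only the file name comes from the list above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// sessionIDFile is the file name mentioned in the docs above.
const sessionIDFile = ".cursor_session_id"

// saveChatID writes the chat ID into the workspace (hypothetical helper).
func saveChatID(workDir, chatID string) error {
	path := filepath.Join(workDir, sessionIDFile)
	return os.WriteFile(path, []byte(chatID+"\n"), 0o600)
}

// loadChatID returns the stored chat ID, or "" if none has been saved yet.
func loadChatID(workDir string) (string, error) {
	data, err := os.ReadFile(filepath.Join(workDir, sessionIDFile))
	if errors.Is(err, os.ErrNotExist) {
		return "", nil // fresh workspace: start a new conversation
	}
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(data)), nil
}

// roundTrip saves and reloads a chat ID in a temporary workspace.
func roundTrip(chatID string) (string, error) {
	dir, err := os.MkdirTemp("", "cursor-ws-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	if first, err := loadChatID(dir); err != nil || first != "" {
		return "", fmt.Errorf("fresh workspace returned %q (err=%v)", first, err)
	}
	if err := saveChatID(dir, chatID); err != nil {
		return "", err
	}
	return loadChatID(dir)
}

func main() {
	id, err := roundTrip("chat-123")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("would pass: --resume", id)
}
```

An empty result from `loadChatID` would correspond to starting a new conversation, while a non-empty one would be passed to `--resume`.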

### Tool Execution

Cursor CLI executes tools natively (filesystem, web, terminal). GoClaw forwards tool results back and lets the CLI loop continue. This mirrors Claude CLI's execution model.

### Headless Flags

The provider invokes `agent` with these critical flags:
- `--print` — stream output to stdout
- `--output-format stream-json` — use structured event format
- `--force` — bypass confirmation prompts (omitted when `perm_mode` is `default` or `strict`)
- `--trust` — skip workspace security dialogs
- `--sandbox enabled` — added when `perm_mode` is `sandbox`
- `--workspace <workdir>` — set working directory
- `--approve-mcps` — auto-approve MCP server connections (if configured)
- `--resume <chatId>` — resume existing conversation (if session ID found)

`perm_mode` (config `providers.cursor_cli.perm_mode`, env `GOCLAW_CURSOR_CLI_PERM_MODE`, or DB `settings.perm_mode`) controls `--force` / `--sandbox`:
- **`force`** (default) — `--force` + `--trust` for unattended headless runs
- **`default`** — `--trust` only (stricter; same as `strict`)
- **`sandbox`** — `--force` + `--trust` + `--sandbox enabled`
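
The mapping from `perm_mode` to CLI flags can be sketched as below. The function name `buildAgentArgs`, its parameters, the flag ordering, and the fallback of unknown modes to `force` are assumptions for illustration; only the flags themselves and the three documented modes (plus the `strict` alias) come from the text above.

```go
package main

import "fmt"

// buildAgentArgs assembles headless flags for the `agent` binary.
// Hypothetical sketch: argument order and the handling of unknown
// perm modes are assumptions, not the provider's actual behavior.
func buildAgentArgs(permMode, workDir, chatID string, hasMCP bool) []string {
	args := []string{"--print", "--output-format", "stream-json", "--trust"}

	switch permMode {
	case "default", "strict":
		// --trust only: no --force, so confirmation prompts are not bypassed.
	case "sandbox":
		args = append(args, "--force", "--sandbox", "enabled")
	default: // "force" or unset
		args = append(args, "--force")
	}

	args = append(args, "--workspace", workDir)
	if hasMCP {
		args = append(args, "--approve-mcps")
	}
	if chatID != "" {
		args = append(args, "--resume", chatID)
	}
	return args
}

func main() {
	fmt.Println(buildAgentArgs("sandbox", "/tmp/cursor-workspaces/s1", "chat-123", true))
	fmt.Println(buildAgentArgs("default", "/tmp/cursor-workspaces/s2", "", false))
}
```

Keeping `--trust` unconditional and layering `--force` and `--sandbox enabled` on top mirrors the three bullet points above.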

### Streaming

- **Chat**: Returns complete response after CLI execution
- **ChatStream**: Streams text chunks as they are produced by the CLI

---

## 14. Codex Provider

The Codex provider integrates with OpenAI's ChatGPT Responses API (OAuth-based), enabling access to the `gpt-5.3-codex` model through the chatgpt.com backend. Unlike standard OpenAI endpoints, Codex uses OAuth token refresh and a custom response format with "phase" markers.

@@ -700,7 +799,7 @@ Reasoning behavior:

---

## 14. File Reference
## 15. File Reference

| File | Purpose |
|------|---------|
@@ -720,6 +819,12 @@ Reasoning behavior:
| `internal/providers/claude_cli_deny_patterns.go` | Path validation and deny pattern enforcement |
| `internal/providers/claude_cli_hooks.go` | Security hooks configuration for CLI tool execution |
| `internal/providers/claude_cli_types.go` | Internal types for CLI provider (session, config, options) |
| `internal/providers/cursor_cli.go` | CursorCLIProvider: orchestrates local cursor agent binary via stdio |
| `internal/providers/cursor_cli_chat.go` | Chat/ChatStream implementation for Cursor CLI provider |
| `internal/providers/cursor_cli_session.go` | Session management: workspace, system prompt, session ID persistence |
| `internal/providers/cursor_cli_mcp.go` | MCP configuration for Cursor CLI provider |
| `internal/providers/cursor_cli_auth.go` | Browser auth status via `agent status` / `agent about` |
| `internal/providers/cursor_cli_parse.go` | Response parsing and chat ID extraction from stream-json |
| `internal/providers/codex.go` | CodexProvider: OAuth-based ChatGPT Responses API |
| `internal/providers/codex_build.go` | Codex request builder: message formatting, phase handling |
| `internal/providers/codex_types.go` | Codex request/response types and OAuth token management |