
fix: backport upstream responses, gemini, auth, and codex websocket fixes onto origin/main#25

Merged
Arron196 merged 4 commits into main from codex/origin-main-upstream-backports-20260403
Apr 3, 2026

Conversation

Arron196 (Owner) commented Apr 3, 2026

Background

These fixes were originally completed and verified on codex/upstream-pr-batch-20260403, but that branch was not based directly on origin/main.

This PR re-stages the work as a single commit on top of origin/main (current baseline: eaf0ba6e), bringing the already-verified batch of upstream backports and local integrations back to the mainline in one pass, instead of carrying the intermediate working branch's history along with it.

What this PR does

1. Responses / OpenAI compatibility fixes

  • Backport the /v1/responses SSE framing fix: split event: / data: blocks are reassembled correctly, so downstream parsers no longer fail on partial frames.
  • Backport the HTTP responses continuation-state maintenance for OpenAI-compatible upstreams, filling in turn-state caching and restoration for the previous_response_id flow.
  • Add/update regression tests under sdk/api/handlers/openai covering the key paths: split SSE, continuation, and websocket pin reset.
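The SSE framing fix described above can be sketched roughly as follows. This is an illustrative reassembler, not the PR's actual code: the `sseReassembler` type and `Feed` method are hypothetical names, and real SSE handling also deals with `\r\n` and comment lines. The core idea is the same, though: buffer partial `event:` / `data:` lines until the blank line that terminates an SSE frame arrives.

```go
package main

import (
	"fmt"
	"strings"
)

// sseReassembler buffers raw stream chunks until a complete SSE frame
// (terminated by a blank line, i.e. "\n\n") is available.
type sseReassembler struct {
	buf strings.Builder
}

// Feed appends a chunk and returns every complete frame now available.
// Partial frames stay buffered for the next call.
func (r *sseReassembler) Feed(chunk string) []string {
	r.buf.WriteString(chunk)
	var frames []string
	for {
		s := r.buf.String()
		idx := strings.Index(s, "\n\n")
		if idx < 0 {
			return frames
		}
		frames = append(frames, s[:idx])
		r.buf.Reset()
		r.buf.WriteString(s[idx+2:])
	}
}

func main() {
	r := &sseReassembler{}
	// The "event:" and "data:" lines arrive split across three chunks.
	fmt.Println(len(r.Feed("event: response.out")))                // 0: no complete frame yet
	fmt.Println(len(r.Feed("put_text.delta\ndata: {\"delta\":"))) // 0: still mid-frame
	frames := r.Feed("\"hi\"}\n\n")
	fmt.Println(len(frames)) // 1: the blank line completed the frame
	fmt.Println(frames[0])
}
```

Without this buffering, a consumer that parses each network chunk as a full frame would see `event:` lines with no `data:`, or truncated JSON payloads, which matches the failure mode the PR describes.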

2. Gemini / Amp / Gemini CLI fixes

  • Backport the Gemini Claude translator's tool schema cleaning so everything goes through CleanJSONSchemaForGemini, and strip fields Gemini rejects, such as eager_input_streaming.
  • Fix thinking-block handling on the Amp streaming path so streamed thinking is no longer wrongly suppressed, and align the signature/rewriter behavior.
  • Backport the Gemini CLI logic that prefers the backend project ID after onboarding, avoiding login/call failures caused by mismatched frontend/backend project IDs.
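The schema-cleaning idea in the first bullet can be sketched as below. This is a simplified stand-in for what a routine like CleanJSONSchemaForGemini does; only `eager_input_streaming` is named by the PR, while the other stripped keys here are illustrative assumptions, as is the recursion strategy.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cleanSchemaForGemini removes keys the Gemini API rejects, recursing into
// nested object schemas. The removal list is illustrative; the PR routes
// tool schemas through its own CleanJSONSchemaForGemini instead.
func cleanSchemaForGemini(schema map[string]any) map[string]any {
	unsupported := map[string]bool{
		"eager_input_streaming": true, // field named in the PR description
		"$schema":               true, // illustrative extras
		"additionalProperties":  true,
	}
	out := make(map[string]any, len(schema))
	for k, v := range schema {
		if unsupported[k] {
			continue
		}
		if nested, ok := v.(map[string]any); ok {
			out[k] = cleanSchemaForGemini(nested)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	raw := []byte(`{"type":"object","eager_input_streaming":true,
		"properties":{"q":{"type":"string","$schema":"x"}}}`)
	var schema map[string]any
	_ = json.Unmarshal(raw, &schema)
	cleaned, _ := json.Marshal(cleanSchemaForGemini(schema))
	fmt.Println(string(cleaned)) // {"properties":{"q":{"type":"string"}},"type":"object"}
}
```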

3. Executor / Auth / Codex metadata and proxy fixes

  • Backport the environment-variable fallback for the executor's outbound HTTP proxy, while keeping Antigravity's HTTP/1.1 transport semantics and reuse policy.
  • Backport the logic that publishes usage records even for 0-token responses.
  • Backport the parsing and preservation of client_id / account_id / plan_type during Codex refresh, reducing metadata loss or misalignment after refresh.

4. Streaming retry / WebSocket / auth rotation fixes

  • Backport the logic that makes stream bootstrap errors actually trigger auth rotation, so streaming requests no longer get stuck on a single failing auth.
  • Backport the handler-layer preservation of the bootstrap retry channel and terminal stream errors.
  • Backport the timing fixes for Codex websocket sessions around quota exhaustion and 401/403/402/429: pin release, account switching, session close, and snapshot commit.
  • Fill in the key WebSocket regression tests, including reconnect after auth switch, fast-fail on session close, and account switching after quota exhaustion.

Backported / equivalently merged upstream PRs

This PR primarily backports the following upstream changes:

  • #2492 fix(responses): reassemble split SSE event/data frames before streaming
  • #2474 fix(gemini): clean tool schemas, stream thinking passthrough, and strip trailing unanswered function calls
  • #2307 fix: preserve responses continuation state for openai-compatible upstreams
  • #2357 fix(proxy): fallback outbound HTTP clients to environment proxies
  • #2416 fix(gemini-cli): use backend project ID from onboarding response
  • #2272 fix(usage): ensure usage records are published for 0-token responses
  • #2153 fix(codex): resolve client_id and account_id handling in token refresh and file listing
  • #2210 fix(auth): stream bootstrap errors now trigger auth rotation
  • #2256 fix(websocket): allow Codex WebSocket sessions to switch accounts after quota exhaustion

In addition, the follow-up review fixes related to #2256 are included in this consolidation:

  • Fix incremental-state pollution and the read-channel race after account switching
  • Fix context snapshot truncation when recovering after an account switch
  • Fix session-close hangs
  • Do not commit the session snapshot on failed turns
  • Release the pin on 401 and add the related observability

Integration notes relative to origin/main

Because this PR is re-staged directly on origin/main, conflict resolution kept the logic already present on origin/main first, then layered this batch of backports on top. The key points:

  • Kept origin/main's existing concurrency-safe websocket tool repair implementation
  • Layered this batch's websocket/auth/pin reset/snapshot management fixes on top of it

In other words, this is not a wholesale copy of the old working branch; it is a re-integration targeted at origin/main.

Test Plan

  • go test ./... -count=1

Copilot AI left a comment

Pull request overview

This PR backports and re-integrates a batch of upstream and local fixes onto origin/main, targeting OpenAI /v1/responses (SSE + websocket) correctness, Gemini/Amp compatibility, proxy/env behavior, usage publishing, and Codex auth/websocket robustness.

Changes:

  • Fix OpenAI Responses streaming behaviors: SSE framing/reassembly, HTTP continuation turn-state caching, and websocket pin reset + snapshot handling with added regression tests.
  • Improve executor/auth reliability: bootstrap retry behavior, auth rotation on stream bootstrap failures, ensure usage publication for 0-token responses, and Codex refresh metadata preservation.
  • Improve integration compatibility: Gemini tool schema cleaning + onboarding backend project ID preference; Amp response/request rewriting for signature/thinking handling; environment proxy fallback behavior.

Reviewed changes

Copilot reviewed 38 out of 39 changed files in this pull request and generated 3 comments.

Summary per file:
sdk/cliproxy/auth/conductor.go Adjusts stream bootstrap failure handling across model pools (auth rotation-related behavior).
sdk/cliproxy/auth/conductor_stream_retry_test.go Adds tests covering auth rotation on bootstrap error / empty stream.
sdk/api/handlers/openai/openai_responses_websocket.go Adds websocket snapshot handling, pin reset on quota/auth errors, and incremental-mode gating after auth reset.
sdk/api/handlers/openai/openai_responses_websocket_test.go Expands websocket regression tests (pin reset, quota switch, status plumbing).
sdk/api/handlers/openai/openai_responses_turn_state.go Introduces in-memory TTL cache for /v1/responses continuation turn state.
sdk/api/handlers/openai/openai_responses_http_continuation_test.go Adds HTTP continuation tests validating cached-turn merge behavior.
sdk/api/handlers/openai/openai_responses_handlers.go Implements SSE framing/reassembly and caches completed turns from streaming/non-streaming responses.
sdk/api/handlers/openai/openai_responses_handlers_stream_test.go Adds unit tests for SSE framing edge-cases (split event/data, split JSON, flush).
sdk/api/handlers/openai/openai_responses_handlers_stream_error_test.go Updates streaming terminal error test to new forwarder signature.
sdk/api/handlers/handlers.go Refactors streaming bootstrap retry and wraps some immediate stream errors into stream results.
sdk/api/handlers/handlers_stream_bootstrap_test.go Adds coverage for split OpenAI Responses SSE event lines through the handler stream path.
internal/translator/gemini/claude/gemini_claude_request.go Cleans Gemini tool schemas, strips unsupported fields, and drops trailing unanswered model function calls.
internal/runtime/executor/qwen_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/proxy_helpers.go Adds environment proxy fallback and transport caching for outbound HTTP clients.
internal/runtime/executor/proxy_helpers_test.go Adds tests for environment proxy fallback, precedence, and transport reuse.
internal/runtime/executor/openai_compat_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/kimi_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/iflow_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/gemini_vertex_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/gemini_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/gemini_cli_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/codex_websockets_executor.go Hardens Codex websocket session concurrency and reconnection behavior (auth switch, active channel ownership).
internal/runtime/executor/codex_websockets_executor_test.go Adds tests for read unblocking, auth-change reconnect, and session close fast-fail.
internal/runtime/executor/codex_executor.go Preserves/derives client_id on refresh and publishes usage on 0-token paths.
internal/runtime/executor/codex_executor_account_id_test.go Adds a manual integration-style test for account check / plan selection logic (skipped unless env set).
internal/runtime/executor/claude_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/antigravity_executor.go Reuses a shared env-proxy HTTP/1.1 transport for Antigravity to avoid pool leaks/mutation.
internal/runtime/executor/antigravity_executor_client_test.go Tests Antigravity env-proxy transport sharing and HTTP/2 disabling.
internal/runtime/executor/aistudio_executor.go Ensures usage records are published even for 0-token responses.
internal/misc/header_utils.go Updates Gemini CLI UA version/string (adds terminal marker).
internal/cmd/login.go Prefers backend project ID from onboarding response.
internal/auth/codex/openai_auth.go Allows refresh token flow to accept an override client_id.
internal/auth/codex/openai_auth_test.go Updates tests for new refresh token API signature.
internal/auth/codex/jwt_parser.go Adds JWT helper to derive client_id from aud.
internal/api/modules/amp/response_rewriter.go Improves Amp response rewriting (signature injection, thinking suppression toggles, streaming heuristics) and adds request sanitizer.
internal/api/modules/amp/response_rewriter_test.go Adds tests for streaming thinking passthrough and request signature sanitization.
internal/api/modules/amp/fallback_handlers.go Applies request sanitizer and expands response rewriting to local-provider paths.
internal/api/handlers/management/auth_files.go Preserves Codex id_token-derived fields while allowing metadata overrides; aligns Gemini project ID selection behavior.
.gitignore Ignores .gocache/.


Comment on lines 690 to +724
	buffered, closed, bootstrapErr := readStreamBootstrap(ctx, streamResult.Chunks)
	if bootstrapErr != nil {
		if errCtx := ctx.Err(); errCtx != nil {
			discardStreamChunks(streamResult.Chunks)
			return nil, errCtx
		}
		if isRequestInvalidError(bootstrapErr) {
			rerr := &Error{Message: bootstrapErr.Error()}
			if se, ok := errors.AsType[cliproxyexecutor.StatusError](bootstrapErr); ok && se != nil {
				rerr.HTTPStatus = se.StatusCode()
			}
			result := Result{AuthID: auth.ID, Provider: provider, Model: resultModel, Success: false, Error: rerr}
			result.RetryAfter = retryAfterFromError(bootstrapErr)
			m.Hook().OnResult(ctx, result)
			discardStreamChunks(streamResult.Chunks)
			return nil, bootstrapErr
		}
		if idx < len(execModels)-1 {
			rerr := &Error{Message: bootstrapErr.Error()}
			if se, ok := errors.AsType[cliproxyexecutor.StatusError](bootstrapErr); ok && se != nil {
				rerr.HTTPStatus = se.StatusCode()
			}
			result := Result{AuthID: auth.ID, Provider: provider, Model: resultModel, Success: false, Error: rerr}
			result.RetryAfter = retryAfterFromError(bootstrapErr)
			m.MarkResult(ctx, result)
			discardStreamChunks(streamResult.Chunks)
			lastErr = bootstrapErr
			continue
		}
		rerr := &Error{Message: bootstrapErr.Error()}
		if se, ok := errors.AsType[cliproxyexecutor.StatusError](bootstrapErr); ok && se != nil {
			rerr.HTTPStatus = se.StatusCode()
		}
		result := Result{AuthID: auth.ID, Provider: provider, Model: resultModel, Success: false, Error: rerr}
		result.RetryAfter = retryAfterFromError(bootstrapErr)
		m.MarkResult(ctx, result)
		discardStreamChunks(streamResult.Chunks)
-		return nil, newStreamBootstrapError(bootstrapErr, streamResult.Headers)
+		lastErr = bootstrapErr
+		continue
	}

	if closed && len(buffered) == 0 {
		emptyErr := &Error{Code: "empty_stream", Message: "upstream stream closed before first payload", Retryable: true}
		result := Result{AuthID: auth.ID, Provider: provider, Model: resultModel, Success: false, Error: emptyErr}
		m.MarkResult(ctx, result)
		if idx < len(execModels)-1 {
			lastErr = emptyErr
			continue
		}
-		return nil, newStreamBootstrapError(emptyErr, streamResult.Headers)
+		lastErr = emptyErr
+		continue
Copilot AI Apr 3, 2026

executeStreamWithModelPool no longer wraps bootstrap/empty-stream failures with newStreamBootstrapError(...), which means executeStreamMixedOnce can no longer convert the final bootstrap failure into a StreamResult carrying upstream headers (streamErrorResult(bootstrapErr.Headers(), ...)). This looks like a regression for passthrough headers on bootstrap failures (e.g. Retry-After), and streamBootstrapError / newStreamBootstrapError now appear unused. Consider either (1) reintroducing the wrapper but make it also preserve StatusCode() by delegating to the underlying error, or (2) propagate streamResult.Headers through another channel/type so callers can keep both status code and headers.

Arron196 (Owner, Author) replied:

Preserved bootstrap headers without changing the Manager API contract. Final bootstrap and empty-stream failures now keep a streamBootstrapError wrapper with delegated StatusCode(), and handler-side immediate stream wrapping now carries Headers() into the synthetic StreamResult. This keeps ExecuteStream returning an error while still forwarding headers like Retry-After downstream.

Comment on lines +98 to +162
func environmentProxyConfigured() bool {
	for _, key := range environmentProxyKeys {
		if strings.TrimSpace(os.Getenv(key)) != "" {
			return true
		}
	}
	return false
}

func newEnvironmentProxyTransport() *http.Transport {
	signature := environmentProxySignature()
	if cached, ok := environmentProxyTransportCache.Load(signature); ok {
		return cached.(*http.Transport)
	}

	proxyFunc := environmentProxyFunc()
	var transport *http.Transport
	if base, ok := http.DefaultTransport.(*http.Transport); ok && base != nil {
		clone := base.Clone()
		clone.Proxy = proxyFunc
		transport = clone
	} else {
		transport = &http.Transport{Proxy: proxyFunc}
	}
	actual, _ := environmentProxyTransportCache.LoadOrStore(signature, transport)
	return actual.(*http.Transport)
}

func environmentProxySignature() string {
	var values []string
	for _, key := range environmentProxyKeys {
		values = append(values, key+"="+strings.TrimSpace(os.Getenv(key)))
	}
	return strings.Join(values, "|")
}

func environmentProxyFunc() func(*http.Request) (*url.URL, error) {
	httpProxy := firstEnvironmentValue("HTTP_PROXY", "http_proxy")
	httpsProxy := firstEnvironmentValue("HTTPS_PROXY", "https_proxy")
	allProxy := firstEnvironmentValue("ALL_PROXY", "all_proxy")

	return func(req *http.Request) (*url.URL, error) {
		if req == nil || req.URL == nil {
			return nil, nil
		}

		raw := ""
		switch strings.ToLower(req.URL.Scheme) {
		case "https":
			raw = firstNonEmpty(httpsProxy, allProxy, httpProxy)
		case "http":
			raw = firstNonEmpty(httpProxy, allProxy, httpsProxy)
		default:
			raw = firstNonEmpty(allProxy, httpsProxy, httpProxy)
		}
		raw = strings.TrimSpace(raw)
		if raw == "" {
			return nil, nil
		}
		if !strings.Contains(raw, "://") {
			raw = "http://" + raw
		}
		return url.Parse(raw)
	}
}
Copilot AI Apr 3, 2026

The custom environment proxy resolver (environmentProxyFunc) ignores standard NO_PROXY / no_proxy bypass rules, unlike http.ProxyFromEnvironment. This can inadvertently route requests that should be direct (e.g. localhost, in-cluster addresses) through the proxy, which is both a correctness and operational risk. Suggest using http.ProxyFromEnvironment directly for the env-proxy case, or implementing NO_PROXY matching and including it in the cache signature.

Arron196 (Owner, Author) replied:

Switched the env-proxy fallback to golang.org/x/net/http/httpproxy.FromEnvironment().ProxyFunc() and included NO_PROXY / no_proxy in the transport cache signature. That keeps the transport reuse we wanted, while restoring the standard bypass behavior for localhost, cluster-local, and other no-proxy targets.

Comment on lines +134 to +170
func TestCodexAccountCheck(t *testing.T) {
	accessToken := os.Getenv("CODEX_ACCESS_TOKEN")
	if accessToken == "" {
		t.Skip("skipping: CODEX_ACCESS_TOKEN not set")
	}
	proxyURL := os.Getenv("CODEX_PROXY_URL")
	deviceID := uuid.NewString()
	targetURL := "https://chatgpt.com/backend-api/accounts/check/v4-2023-04-27?timezone_offset_min=-480"

	req, err := http.NewRequest(http.MethodGet, targetURL, nil)
	if err != nil {
		t.Fatalf("build request: %v", err)
	}

	req.Header.Set("accept", "*/*")
	req.Header.Set("accept-language", "zh-HK,zh;q=0.9,en-US;q=0.8,en;q=0.7")
	req.Header.Set("authorization", "Bearer "+strings.TrimSpace(accessToken))
	req.Header.Set("oai-device-id", deviceID)
	req.Header.Set("oai-language", "zh-HK")
	req.Header.Set("referer", "https://chatgpt.com/")
	req.Header.Set("user-agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36")
	req.Header.Set("sec-ch-ua", `"Google Chrome";v="141", "Not?A_Brand";v="8", "Chromium";v="141"`)
	req.Header.Set("sec-ch-ua-mobile", "?0")
	req.Header.Set("sec-ch-ua-platform", `"macOS"`)
	req.Header.Set("sec-fetch-dest", "empty")
	req.Header.Set("sec-fetch-mode", "cors")
	req.Header.Set("sec-fetch-site", "same-origin")
	req.Header.Set("priority", "u=1, i")

	client := &http.Client{
		Transport: newUtlsTransport(proxyURL),
	}

	resp, err := client.Do(req)
	if err != nil {
		t.Fatalf("request failed: %v", err)
	}
Copilot AI Apr 3, 2026

This test performs a real network call when CODEX_ACCESS_TOKEN is set, but the http.Client has no timeout and the request is not bound to a context deadline. If the upstream or proxy hangs, this can stall go test runs. Consider adding a client timeout / per-request context with deadline, and/or guarding the test behind an explicit build tag (e.g. //go:build integration) so it can't be enabled accidentally in automated environments.

Arron196 (Owner, Author) replied:

Hardened this path in two ways: the live account-check test is now integration-only (//go:build integration), and it also uses both a request context deadline and an http.Client timeout. That keeps ordinary go test runs from triggering or hanging on this external call.

Copilot AI left a comment

Pull request overview

Copilot reviewed 38 out of 39 changed files in this pull request and generated 30 comments.



Comment on lines +232 to +233
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

ensurePublished is deferred after trackFailure, so it runs first (LIFO). That publishes a success usage record even when err is non-nil, and then trackFailure cannot publish the failure record because usageReporter.once has already fired. Swap the defer order (defer ensurePublished first, then defer trackFailure) or gate ensurePublished on err == nil so failures are recorded correctly.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Arron196 (Owner, Author) replied:

Fixed by centralizing the success/failure publish decision in usageReporter.finalize(ctx, &err) and switching the affected executors to a single defer. That removes the LIFO ordering hazard entirely instead of relying on per-call defer ordering. Fresh verification: go test ./... -count=1.
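The finalize shape described in this reply can be sketched as follows. Names and the exact publish payload are illustrative; the point is that one sync.Once-guarded method decides success vs. failure, so callers need a single defer and the LIFO ordering hazard cannot recur.

```go
package main

import (
	"fmt"
	"sync"
)

// usageReporter sketch: finalize makes the success/failure decision in one
// place. Field and method names are illustrative stand-ins.
type usageReporter struct {
	once      sync.Once
	published string
}

// finalize publishes exactly once, choosing the record based on *err.
func (r *usageReporter) finalize(err *error) {
	r.once.Do(func() {
		if err != nil && *err != nil {
			r.published = "failure: " + (*err).Error()
			return
		}
		r.published = "success"
	})
}

// run models an executor method with a named error return and one defer.
func run(fail bool) (result string, err error) {
	r := &usageReporter{}
	defer func() { result = r.published }() // observe what was published
	defer r.finalize(&err)                  // single defer: no ordering to get wrong
	if fail {
		err = fmt.Errorf("upstream exploded")
	}
	return
}

func main() {
	ok, _ := run(false)
	bad, _ := run(true)
	fmt.Println(ok)  // success
	fmt.Println(bad) // failure: upstream exploded
}
```

Compare this with the two-defer version the reviewer flagged: there, `ensurePublished` deferred last ran first (defers are LIFO) and consumed the once before `trackFailure` could record the failure.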

Comment on lines +76 to +77
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

Same defer-order issue as other executors in this PR: ensurePublished currently runs before trackFailure due to LIFO defers, which prevents failure usage records from being emitted when err is non-nil. Reverse the defers or make ensurePublished conditional on success.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Arron196 (Owner, Author) replied:

Fixed by centralizing the success/failure publish decision in usageReporter.finalize(ctx, &err) and switching the affected executors to a single defer. That removes the LIFO ordering hazard entirely instead of relying on per-call defer ordering. Fresh verification: go test ./... -count=1.

Comment on lines +80 to +81
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

ensurePublished is deferred after trackFailure, which means it executes first and publishes a success usage record even on error; the subsequent trackFailure publish is suppressed by once.Do. Reverse defer order (or conditionally call ensurePublished only on success) to avoid misreporting failures.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Arron196 (Owner, Author) replied:

Fixed by centralizing the success/failure publish decision in usageReporter.finalize(ctx, &err) and switching the affected executors to a single defer. That removes the LIFO ordering hazard entirely instead of relying on per-call defer ordering. Fresh verification: go test ./... -count=1.

Comment on lines +90 to +91
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

The added defer reporter.ensurePublished(ctx) runs before defer reporter.trackFailure(...) (LIFO), so failures will be published as success and the real failure record will be dropped by once.Do. Please reverse the defer order or guard ensurePublished behind err == nil.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Arron196 (Owner, Author) replied:

Fixed by centralizing the success/failure publish decision in usageReporter.finalize(ctx, &err) and switching the affected executors to a single defer. That removes the LIFO ordering hazard entirely instead of relying on per-call defer ordering. Fresh verification: go test ./... -count=1.

Comment on lines +305 to +306
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

ensurePublished is deferred after trackFailure, so it executes first and can publish a success usage record even when the function returns an error; the subsequent failure publish is then blocked by once.Do. Reverse defer order (or gate ensurePublished on err == nil).

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Arron196 (Owner, Author) replied:

Fixed by centralizing the success/failure publish decision in usageReporter.finalize(ctx, &err) and switching the affected executors to a single defer. That removes the LIFO ordering hazard entirely instead of relying on per-call defer ordering. Fresh verification: go test ./... -count=1.

Comment on lines +357 to +358
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

ensurePublished is deferred after trackFailure, so it executes first and can publish success even when returning an error; failure usage publish is then suppressed by once.Do. Reverse defer order / guard ensurePublished.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Comment on lines +750 to +751
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

Same defer-order issue in streaming path: because ensurePublished is deferred after trackFailure, it runs first and can publish a success usage record even when err is set. Reverse the defers or gate ensurePublished behind err == nil.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Comment on lines +179 to +180
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

Streaming path has the same defer-order issue: ensurePublished runs before trackFailure (LIFO) and can prevent failure usage records from being emitted. Reverse defers or conditionally call ensurePublished.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Comment on lines +16 to +27
const responsesTurnStateTTL = 30 * time.Minute

type responsesTurnStateEntry struct {
	request []byte
	output  []byte
	expire  time.Time
}

type responsesTurnStateCache struct {
	entries sync.Map
}


Copilot AI Apr 3, 2026

responsesTurnStateCache uses a sync.Map with TTL checks only on load(), but there is no periodic cleanup or size cap. Under high traffic (many unique response IDs), this can grow unbounded for up to responsesTurnStateTTL and beyond if entries are never reloaded, leading to memory pressure. Consider adding a background purge (similar to startClaudeDeviceProfileCacheCleanup in internal/runtime/executor/claude_device_profile.go) and/or enforcing a max entry count/LRU eviction.

Comment on lines +78 to 83
	antigravityTransport                 *http.Transport
	antigravityTransportOnce             sync.Once
	antigravityEnvironmentProxyTransport = sync.OnceValue(func() *http.Transport {
		return cloneTransportWithHTTP11(newEnvironmentProxyTransport())
	})
)
Copilot AI Apr 3, 2026

antigravityEnvironmentProxyTransport is created via sync.OnceValue using the first observed environment proxy signature. If HTTP_PROXY/NO_PROXY change at runtime (or across tests), newEnvironmentProxyTransport() will return a different transport, but this code will still swap in the stale OnceValue transport, potentially using the wrong proxy/bypass rules. Consider caching the HTTP/1.1 clone keyed by the same environment signature (similar to environmentProxyTransportCache) instead of a single OnceValue instance.

@Arron196 Arron196 merged commit 3fb6e66 into main Apr 3, 2026
1 of 2 checks passed