
fix: backport upstream responses, gemini, auth, and codex websocket fixes onto origin/main#25

Merged
Arron196 merged 4 commits into main from codex/origin-main-upstream-backports-20260403
Apr 3, 2026

Conversation

Arron196 (Owner) commented Apr 3, 2026

Background

These fixes were originally completed and verified on codex/upstream-pr-batch-20260403, but that branch was not based directly on origin/main.

This PR re-stages the work as a single commit on top of origin/main (current baseline: eaf0ba6e), bringing the already-verified batch of upstream backports and local integrations back to the mainline in one pass, instead of carrying the intermediate working branch's history along with it.

What this PR does

1. Responses / OpenAI compatibility fixes

  • Backport the /v1/responses SSE framing fix: split event: / data: blocks are reassembled correctly, so downstream parsers no longer fail on partial frames.
  • Backport the HTTP responses continuation-state maintenance for OpenAI-compatible upstreams, filling in turn-state caching and restoration for the previous_response_id flow.
  • Add/update regression tests under sdk/api/handlers/openai covering the key paths: split SSE, continuation, and websocket pin reset.
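The SSE framing fix described above can be sketched roughly as follows. This is an illustrative reassembler, not the PR's actual code: the `sseReassembler` type and `Feed` method are hypothetical names, and real SSE handling also deals with `\r\n` and comment lines. The core idea is the same, though: buffer partial `event:` / `data:` lines until the blank line that terminates an SSE frame arrives.

```go
package main

import (
	"fmt"
	"strings"
)

// sseReassembler buffers raw stream chunks until a complete SSE frame
// (terminated by a blank line, i.e. "\n\n") is available.
type sseReassembler struct {
	buf strings.Builder
}

// Feed appends a chunk and returns every complete frame now available.
// Partial frames stay buffered for the next call.
func (r *sseReassembler) Feed(chunk string) []string {
	r.buf.WriteString(chunk)
	var frames []string
	for {
		s := r.buf.String()
		idx := strings.Index(s, "\n\n")
		if idx < 0 {
			return frames
		}
		frames = append(frames, s[:idx])
		r.buf.Reset()
		r.buf.WriteString(s[idx+2:])
	}
}

func main() {
	r := &sseReassembler{}
	// The "event:" and "data:" lines arrive split across three chunks.
	fmt.Println(len(r.Feed("event: response.out")))                // 0: no complete frame yet
	fmt.Println(len(r.Feed("put_text.delta\ndata: {\"delta\":"))) // 0: still mid-frame
	frames := r.Feed("\"hi\"}\n\n")
	fmt.Println(len(frames)) // 1: the blank line completed the frame
	fmt.Println(frames[0])
}
```

Without this buffering, a consumer that parses each network chunk as a full frame would see `event:` lines with no `data:`, or truncated JSON payloads, which matches the failure mode the PR describes.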

2. Gemini / Amp / Gemini CLI fixes

  • Backport the Gemini Claude translator's tool schema cleaning so everything goes through CleanJSONSchemaForGemini, and strip fields Gemini rejects, such as eager_input_streaming.
  • Fix thinking-block handling on the Amp streaming path so streamed thinking is no longer wrongly suppressed, and align the signature/rewriter behavior.
  • Backport the Gemini CLI logic that prefers the backend project ID after onboarding, avoiding login/call failures caused by mismatched frontend/backend project IDs.
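The schema-cleaning idea in the first bullet can be sketched as below. This is a simplified stand-in for what a routine like CleanJSONSchemaForGemini does; only `eager_input_streaming` is named by the PR, while the other stripped keys here are illustrative assumptions, as is the recursion strategy.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cleanSchemaForGemini removes keys the Gemini API rejects, recursing into
// nested object schemas. The removal list is illustrative; the PR routes
// tool schemas through its own CleanJSONSchemaForGemini instead.
func cleanSchemaForGemini(schema map[string]any) map[string]any {
	unsupported := map[string]bool{
		"eager_input_streaming": true, // field named in the PR description
		"$schema":               true, // illustrative extras
		"additionalProperties":  true,
	}
	out := make(map[string]any, len(schema))
	for k, v := range schema {
		if unsupported[k] {
			continue
		}
		if nested, ok := v.(map[string]any); ok {
			out[k] = cleanSchemaForGemini(nested)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	raw := []byte(`{"type":"object","eager_input_streaming":true,
		"properties":{"q":{"type":"string","$schema":"x"}}}`)
	var schema map[string]any
	_ = json.Unmarshal(raw, &schema)
	cleaned, _ := json.Marshal(cleanSchemaForGemini(schema))
	fmt.Println(string(cleaned)) // {"properties":{"q":{"type":"string"}},"type":"object"}
}
```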

3. Executor / Auth / Codex metadata and proxy fixes

  • Backport the environment-variable fallback for the executor's outbound HTTP proxy, while keeping Antigravity's HTTP/1.1 transport semantics and reuse policy.
  • Backport the logic that publishes usage records even for 0-token responses.
  • Backport the parsing and preservation of client_id / account_id / plan_type during Codex refresh, reducing metadata loss or misalignment after refresh.

4. Streaming retry / WebSocket / auth rotation fixes

  • Backport the logic that makes stream bootstrap errors actually trigger auth rotation, so streaming requests no longer get stuck on a single failing auth.
  • Backport the handler-layer preservation of the bootstrap retry channel and terminal stream errors.
  • Backport the timing fixes for Codex websocket sessions around quota exhaustion and 401/403/402/429: pin release, account switching, session close, and snapshot commit.
  • Fill in the key WebSocket regression tests, including reconnect after auth switch, fast-fail on session close, and account switching after quota exhaustion.

Backported / equivalently merged upstream PRs

This PR primarily backports the following upstream changes:

  • #2492 fix(responses): reassemble split SSE event/data frames before streaming
  • #2474 fix(gemini): clean tool schemas, stream thinking passthrough, and strip trailing unanswered function calls
  • #2307 fix: preserve responses continuation state for openai-compatible upstreams
  • #2357 fix(proxy): fallback outbound HTTP clients to environment proxies
  • #2416 fix(gemini-cli): use backend project ID from onboarding response
  • #2272 fix(usage): ensure usage records are published for 0-token responses
  • #2153 fix(codex): resolve client_id and account_id handling in token refresh and file listing
  • #2210 fix(auth): stream bootstrap errors now trigger auth rotation
  • #2256 fix(websocket): allow Codex WebSocket sessions to switch accounts after quota exhaustion

In addition, the follow-up review fixes related to #2256 are included in this consolidation:

  • Fix incremental-state pollution and the read-channel race after account switching
  • Fix context snapshot truncation when recovering after an account switch
  • Fix session-close hangs
  • Do not commit the session snapshot on failed turns
  • Release the pin on 401 and add the related observability

Integration notes relative to origin/main

Because this PR is re-staged directly on origin/main, conflict resolution kept the logic already present on origin/main first, then layered this batch of backports on top. The key points:

  • Kept origin/main's existing concurrency-safe websocket tool repair implementation
  • Layered this batch's websocket/auth/pin reset/snapshot management fixes on top of it

In other words, this is not a wholesale copy of the old working branch; it is a re-integration targeted at origin/main.

Test Plan

  • go test ./... -count=1

Copilot AI left a comment

Pull request overview

This PR backports and re-integrates a batch of upstream and local fixes onto origin/main, targeting OpenAI /v1/responses (SSE + websocket) correctness, Gemini/Amp compatibility, proxy/env behavior, usage publishing, and Codex auth/websocket robustness.

Changes:

  • Fix OpenAI Responses streaming behaviors: SSE framing/reassembly, HTTP continuation turn-state caching, and websocket pin reset + snapshot handling with added regression tests.
  • Improve executor/auth reliability: bootstrap retry behavior, auth rotation on stream bootstrap failures, ensure usage publication for 0-token responses, and Codex refresh metadata preservation.
  • Improve integration compatibility: Gemini tool schema cleaning + onboarding backend project ID preference; Amp response/request rewriting for signature/thinking handling; environment proxy fallback behavior.

Reviewed changes

Copilot reviewed 38 out of 39 changed files in this pull request and generated 3 comments.

Summary per file:
sdk/cliproxy/auth/conductor.go Adjusts stream bootstrap failure handling across model pools (auth rotation-related behavior).
sdk/cliproxy/auth/conductor_stream_retry_test.go Adds tests covering auth rotation on bootstrap error / empty stream.
sdk/api/handlers/openai/openai_responses_websocket.go Adds websocket snapshot handling, pin reset on quota/auth errors, and incremental-mode gating after auth reset.
sdk/api/handlers/openai/openai_responses_websocket_test.go Expands websocket regression tests (pin reset, quota switch, status plumbing).
sdk/api/handlers/openai/openai_responses_turn_state.go Introduces in-memory TTL cache for /v1/responses continuation turn state.
sdk/api/handlers/openai/openai_responses_http_continuation_test.go Adds HTTP continuation tests validating cached-turn merge behavior.
sdk/api/handlers/openai/openai_responses_handlers.go Implements SSE framing/reassembly and caches completed turns from streaming/non-streaming responses.
sdk/api/handlers/openai/openai_responses_handlers_stream_test.go Adds unit tests for SSE framing edge-cases (split event/data, split JSON, flush).
sdk/api/handlers/openai/openai_responses_handlers_stream_error_test.go Updates streaming terminal error test to new forwarder signature.
sdk/api/handlers/handlers.go Refactors streaming bootstrap retry and wraps some immediate stream errors into stream results.
sdk/api/handlers/handlers_stream_bootstrap_test.go Adds coverage for split OpenAI Responses SSE event lines through the handler stream path.
internal/translator/gemini/claude/gemini_claude_request.go Cleans Gemini tool schemas, strips unsupported fields, and drops trailing unanswered model function calls.
internal/runtime/executor/qwen_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/proxy_helpers.go Adds environment proxy fallback and transport caching for outbound HTTP clients.
internal/runtime/executor/proxy_helpers_test.go Adds tests for environment proxy fallback, precedence, and transport reuse.
internal/runtime/executor/openai_compat_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/kimi_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/iflow_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/gemini_vertex_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/gemini_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/gemini_cli_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/codex_websockets_executor.go Hardens Codex websocket session concurrency and reconnection behavior (auth switch, active channel ownership).
internal/runtime/executor/codex_websockets_executor_test.go Adds tests for read unblocking, auth-change reconnect, and session close fast-fail.
internal/runtime/executor/codex_executor.go Preserves/derives client_id on refresh and publishes usage on 0-token paths.
internal/runtime/executor/codex_executor_account_id_test.go Adds a manual integration-style test for account check / plan selection logic (skipped unless env set).
internal/runtime/executor/claude_executor.go Ensures usage records are published even for 0-token responses.
internal/runtime/executor/antigravity_executor.go Reuses a shared env-proxy HTTP/1.1 transport for Antigravity to avoid pool leaks/mutation.
internal/runtime/executor/antigravity_executor_client_test.go Tests Antigravity env-proxy transport sharing and HTTP/2 disabling.
internal/runtime/executor/aistudio_executor.go Ensures usage records are published even for 0-token responses.
internal/misc/header_utils.go Updates Gemini CLI UA version/string (adds terminal marker).
internal/cmd/login.go Prefers backend project ID from onboarding response.
internal/auth/codex/openai_auth.go Allows refresh token flow to accept an override client_id.
internal/auth/codex/openai_auth_test.go Updates tests for new refresh token API signature.
internal/auth/codex/jwt_parser.go Adds JWT helper to derive client_id from aud.
internal/api/modules/amp/response_rewriter.go Improves Amp response rewriting (signature injection, thinking suppression toggles, streaming heuristics) and adds request sanitizer.
internal/api/modules/amp/response_rewriter_test.go Adds tests for streaming thinking passthrough and request signature sanitization.
internal/api/modules/amp/fallback_handlers.go Applies request sanitizer and expands response rewriting to local-provider paths.
internal/api/handlers/management/auth_files.go Preserves Codex id_token-derived fields while allowing metadata overrides; aligns Gemini project ID selection behavior.
.gitignore Ignores .gocache/.


Comment on lines 690 to +724
	buffered, closed, bootstrapErr := readStreamBootstrap(ctx, streamResult.Chunks)
	if bootstrapErr != nil {
		if errCtx := ctx.Err(); errCtx != nil {
			discardStreamChunks(streamResult.Chunks)
			return nil, errCtx
		}
		if isRequestInvalidError(bootstrapErr) {
			rerr := &Error{Message: bootstrapErr.Error()}
			if se, ok := errors.AsType[cliproxyexecutor.StatusError](bootstrapErr); ok && se != nil {
				rerr.HTTPStatus = se.StatusCode()
			}
			result := Result{AuthID: auth.ID, Provider: provider, Model: resultModel, Success: false, Error: rerr}
			result.RetryAfter = retryAfterFromError(bootstrapErr)
			m.Hook().OnResult(ctx, result)
			discardStreamChunks(streamResult.Chunks)
			return nil, bootstrapErr
		}
		if idx < len(execModels)-1 {
			rerr := &Error{Message: bootstrapErr.Error()}
			if se, ok := errors.AsType[cliproxyexecutor.StatusError](bootstrapErr); ok && se != nil {
				rerr.HTTPStatus = se.StatusCode()
			}
			result := Result{AuthID: auth.ID, Provider: provider, Model: resultModel, Success: false, Error: rerr}
			result.RetryAfter = retryAfterFromError(bootstrapErr)
			m.MarkResult(ctx, result)
			discardStreamChunks(streamResult.Chunks)
			lastErr = bootstrapErr
			continue
		}
		rerr := &Error{Message: bootstrapErr.Error()}
		if se, ok := errors.AsType[cliproxyexecutor.StatusError](bootstrapErr); ok && se != nil {
			rerr.HTTPStatus = se.StatusCode()
		}
		result := Result{AuthID: auth.ID, Provider: provider, Model: resultModel, Success: false, Error: rerr}
		result.RetryAfter = retryAfterFromError(bootstrapErr)
		m.MarkResult(ctx, result)
		discardStreamChunks(streamResult.Chunks)
-		return nil, newStreamBootstrapError(bootstrapErr, streamResult.Headers)
+		lastErr = bootstrapErr
+		continue
	}

	if closed && len(buffered) == 0 {
		emptyErr := &Error{Code: "empty_stream", Message: "upstream stream closed before first payload", Retryable: true}
		result := Result{AuthID: auth.ID, Provider: provider, Model: resultModel, Success: false, Error: emptyErr}
		m.MarkResult(ctx, result)
		if idx < len(execModels)-1 {
			lastErr = emptyErr
			continue
		}
-		return nil, newStreamBootstrapError(emptyErr, streamResult.Headers)
+		lastErr = emptyErr
+		continue
Copilot AI Apr 3, 2026

executeStreamWithModelPool no longer wraps bootstrap/empty-stream failures with newStreamBootstrapError(...), which means executeStreamMixedOnce can no longer convert the final bootstrap failure into a StreamResult carrying upstream headers (streamErrorResult(bootstrapErr.Headers(), ...)). This looks like a regression for passthrough headers on bootstrap failures (e.g. Retry-After), and streamBootstrapError / newStreamBootstrapError now appear unused. Consider either (1) reintroducing the wrapper but make it also preserve StatusCode() by delegating to the underlying error, or (2) propagate streamResult.Headers through another channel/type so callers can keep both status code and headers.

Arron196 (Owner, Author) replied:

Preserved bootstrap headers without changing the Manager API contract. Final bootstrap and empty-stream failures now keep a streamBootstrapError wrapper with delegated StatusCode(), and handler-side immediate stream wrapping now carries Headers() into the synthetic StreamResult. This keeps ExecuteStream returning an error while still forwarding headers like Retry-After downstream.

Comment on lines +98 to +162
func environmentProxyConfigured() bool {
	for _, key := range environmentProxyKeys {
		if strings.TrimSpace(os.Getenv(key)) != "" {
			return true
		}
	}
	return false
}

func newEnvironmentProxyTransport() *http.Transport {
	signature := environmentProxySignature()
	if cached, ok := environmentProxyTransportCache.Load(signature); ok {
		return cached.(*http.Transport)
	}

	proxyFunc := environmentProxyFunc()
	var transport *http.Transport
	if base, ok := http.DefaultTransport.(*http.Transport); ok && base != nil {
		clone := base.Clone()
		clone.Proxy = proxyFunc
		transport = clone
	} else {
		transport = &http.Transport{Proxy: proxyFunc}
	}
	actual, _ := environmentProxyTransportCache.LoadOrStore(signature, transport)
	return actual.(*http.Transport)
}

func environmentProxySignature() string {
	var values []string
	for _, key := range environmentProxyKeys {
		values = append(values, key+"="+strings.TrimSpace(os.Getenv(key)))
	}
	return strings.Join(values, "|")
}

func environmentProxyFunc() func(*http.Request) (*url.URL, error) {
	httpProxy := firstEnvironmentValue("HTTP_PROXY", "http_proxy")
	httpsProxy := firstEnvironmentValue("HTTPS_PROXY", "https_proxy")
	allProxy := firstEnvironmentValue("ALL_PROXY", "all_proxy")

	return func(req *http.Request) (*url.URL, error) {
		if req == nil || req.URL == nil {
			return nil, nil
		}

		raw := ""
		switch strings.ToLower(req.URL.Scheme) {
		case "https":
			raw = firstNonEmpty(httpsProxy, allProxy, httpProxy)
		case "http":
			raw = firstNonEmpty(httpProxy, allProxy, httpsProxy)
		default:
			raw = firstNonEmpty(allProxy, httpsProxy, httpProxy)
		}
		raw = strings.TrimSpace(raw)
		if raw == "" {
			return nil, nil
		}
		if !strings.Contains(raw, "://") {
			raw = "http://" + raw
		}
		return url.Parse(raw)
	}
}
Copilot AI Apr 3, 2026

The custom environment proxy resolver (environmentProxyFunc) ignores standard NO_PROXY / no_proxy bypass rules, unlike http.ProxyFromEnvironment. This can inadvertently route requests that should be direct (e.g. localhost, in-cluster addresses) through the proxy, which is both a correctness and operational risk. Suggest using http.ProxyFromEnvironment directly for the env-proxy case, or implementing NO_PROXY matching and including it in the cache signature.

Arron196 (Owner, Author) replied:

Switched the env-proxy fallback to golang.org/x/net/http/httpproxy.FromEnvironment().ProxyFunc() and included NO_PROXY / no_proxy in the transport cache signature. That keeps the transport reuse we wanted, while restoring the standard bypass behavior for localhost, cluster-local, and other no-proxy targets.

Comment on lines +134 to +170
func TestCodexAccountCheck(t *testing.T) {
	accessToken := os.Getenv("CODEX_ACCESS_TOKEN")
	if accessToken == "" {
		t.Skip("skipping: CODEX_ACCESS_TOKEN not set")
	}
	proxyURL := os.Getenv("CODEX_PROXY_URL")
	deviceID := uuid.NewString()
	targetURL := "https://chatgpt.com/backend-api/accounts/check/v4-2023-04-27?timezone_offset_min=-480"

	req, err := http.NewRequest(http.MethodGet, targetURL, nil)
	if err != nil {
		t.Fatalf("build request: %v", err)
	}

	req.Header.Set("accept", "*/*")
	req.Header.Set("accept-language", "zh-HK,zh;q=0.9,en-US;q=0.8,en;q=0.7")
	req.Header.Set("authorization", "Bearer "+strings.TrimSpace(accessToken))
	req.Header.Set("oai-device-id", deviceID)
	req.Header.Set("oai-language", "zh-HK")
	req.Header.Set("referer", "https://chatgpt.com/")
	req.Header.Set("user-agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36")
	req.Header.Set("sec-ch-ua", `"Google Chrome";v="141", "Not?A_Brand";v="8", "Chromium";v="141"`)
	req.Header.Set("sec-ch-ua-mobile", "?0")
	req.Header.Set("sec-ch-ua-platform", `"macOS"`)
	req.Header.Set("sec-fetch-dest", "empty")
	req.Header.Set("sec-fetch-mode", "cors")
	req.Header.Set("sec-fetch-site", "same-origin")
	req.Header.Set("priority", "u=1, i")

	client := &http.Client{
		Transport: newUtlsTransport(proxyURL),
	}

	resp, err := client.Do(req)
	if err != nil {
		t.Fatalf("request failed: %v", err)
	}
Copilot AI Apr 3, 2026

This test performs a real network call when CODEX_ACCESS_TOKEN is set, but the http.Client has no timeout and the request is not bound to a context deadline. If the upstream or proxy hangs, this can stall go test runs. Consider adding a client timeout / per-request context with deadline, and/or guarding the test behind an explicit build tag (e.g. //go:build integration) so it can't be enabled accidentally in automated environments.

Arron196 (Owner, Author) replied:

Hardened this path in two ways: the live account-check test is now integration-only (//go:build integration), and it also uses both a request context deadline and an http.Client timeout. That keeps ordinary go test runs from triggering or hanging on this external call.

Copilot AI left a comment

Pull request overview

Copilot reviewed 38 out of 39 changed files in this pull request and generated 30 comments.



Comment on lines +232 to +233
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

ensurePublished is deferred after trackFailure, so it runs first (LIFO). That publishes a success usage record even when err is non-nil, and then trackFailure cannot publish the failure record because usageReporter.once has already fired. Swap the defer order (defer ensurePublished first, then defer trackFailure) or gate ensurePublished on err == nil so failures are recorded correctly.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Arron196 (Owner, Author) replied:

Fixed by centralizing the success/failure publish decision in usageReporter.finalize(ctx, &err) and switching the affected executors to a single defer. That removes the LIFO ordering hazard entirely instead of relying on per-call defer ordering. Fresh verification: go test ./... -count=1.
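The finalize shape described in this reply can be sketched as follows. Names and the exact publish payload are illustrative; the point is that one sync.Once-guarded method decides success vs. failure, so callers need a single defer and the LIFO ordering hazard cannot recur.

```go
package main

import (
	"fmt"
	"sync"
)

// usageReporter sketch: finalize makes the success/failure decision in one
// place. Field and method names are illustrative stand-ins.
type usageReporter struct {
	once      sync.Once
	published string
}

// finalize publishes exactly once, choosing the record based on *err.
func (r *usageReporter) finalize(err *error) {
	r.once.Do(func() {
		if err != nil && *err != nil {
			r.published = "failure: " + (*err).Error()
			return
		}
		r.published = "success"
	})
}

// run models an executor method with a named error return and one defer.
func run(fail bool) (result string, err error) {
	r := &usageReporter{}
	defer func() { result = r.published }() // observe what was published
	defer r.finalize(&err)                  // single defer: no ordering to get wrong
	if fail {
		err = fmt.Errorf("upstream exploded")
	}
	return
}

func main() {
	ok, _ := run(false)
	bad, _ := run(true)
	fmt.Println(ok)  // success
	fmt.Println(bad) // failure: upstream exploded
}
```

Compare this with the two-defer version the reviewer flagged: there, `ensurePublished` deferred last ran first (defers are LIFO) and consumed the once before `trackFailure` could record the failure.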

Comment on lines +76 to +77
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

Same defer-order issue as other executors in this PR: ensurePublished currently runs before trackFailure due to LIFO defers, which prevents failure usage records from being emitted when err is non-nil. Reverse the defers or make ensurePublished conditional on success.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Arron196 (Owner, Author) replied:

Fixed by centralizing the success/failure publish decision in usageReporter.finalize(ctx, &err) and switching the affected executors to a single defer. That removes the LIFO ordering hazard entirely instead of relying on per-call defer ordering. Fresh verification: go test ./... -count=1.

Comment on lines +80 to +81
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

ensurePublished is deferred after trackFailure, which means it executes first and publishes a success usage record even on error; the subsequent trackFailure publish is suppressed by once.Do. Reverse defer order (or conditionally call ensurePublished only on success) to avoid misreporting failures.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Arron196 (Owner, Author) replied:

Fixed by centralizing the success/failure publish decision in usageReporter.finalize(ctx, &err) and switching the affected executors to a single defer. That removes the LIFO ordering hazard entirely instead of relying on per-call defer ordering. Fresh verification: go test ./... -count=1.

Comment on lines +90 to +91
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

The added defer reporter.ensurePublished(ctx) runs before defer reporter.trackFailure(...) (LIFO), so failures will be published as success and the real failure record will be dropped by once.Do. Please reverse the defer order or guard ensurePublished behind err == nil.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Arron196 (Owner, Author) replied:

Fixed by centralizing the success/failure publish decision in usageReporter.finalize(ctx, &err) and switching the affected executors to a single defer. That removes the LIFO ordering hazard entirely instead of relying on per-call defer ordering. Fresh verification: go test ./... -count=1.

Comment on lines +305 to +306
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

ensurePublished is deferred after trackFailure, so it executes first and can publish a success usage record even when the function returns an error; the subsequent failure publish is then blocked by once.Do. Reverse defer order (or gate ensurePublished on err == nil).

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Arron196 (Owner, Author) replied:

Fixed by centralizing the success/failure publish decision in usageReporter.finalize(ctx, &err) and switching the affected executors to a single defer. That removes the LIFO ordering hazard entirely instead of relying on per-call defer ordering. Fresh verification: go test ./... -count=1.

Comment on lines +357 to +358
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

ensurePublished is deferred after trackFailure, so it executes first and can publish success even when returning an error; failure usage publish is then suppressed by once.Do. Reverse defer order / guard ensurePublished.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Comment on lines +750 to +751
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

Same defer-order issue in streaming path: because ensurePublished is deferred after trackFailure, it runs first and can publish a success usage record even when err is set. Reverse the defers or gate ensurePublished behind err == nil.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Comment on lines +179 to +180
defer reporter.trackFailure(ctx, &err)
defer reporter.ensurePublished(ctx)
Copilot AI Apr 3, 2026

Streaming path has the same defer-order issue: ensurePublished runs before trackFailure (LIFO) and can prevent failure usage records from being emitted. Reverse defers or conditionally call ensurePublished.

Suggested change:

-	defer reporter.trackFailure(ctx, &err)
-	defer reporter.ensurePublished(ctx)
+	defer reporter.ensurePublished(ctx)
+	defer reporter.trackFailure(ctx, &err)

Comment on lines +16 to +27
const responsesTurnStateTTL = 30 * time.Minute

type responsesTurnStateEntry struct {
	request []byte
	output  []byte
	expire  time.Time
}

type responsesTurnStateCache struct {
	entries sync.Map
}


Copilot AI Apr 3, 2026

responsesTurnStateCache uses a sync.Map with TTL checks only on load(), but there is no periodic cleanup or size cap. Under high traffic (many unique response IDs), this can grow unbounded for up to responsesTurnStateTTL and beyond if entries are never reloaded, leading to memory pressure. Consider adding a background purge (similar to startClaudeDeviceProfileCacheCleanup in internal/runtime/executor/claude_device_profile.go) and/or enforcing a max entry count/LRU eviction.

Comment on lines +78 to 83
	antigravityTransport                 *http.Transport
	antigravityTransportOnce             sync.Once
	antigravityEnvironmentProxyTransport = sync.OnceValue(func() *http.Transport {
		return cloneTransportWithHTTP11(newEnvironmentProxyTransport())
	})
)
Copilot AI Apr 3, 2026

antigravityEnvironmentProxyTransport is created via sync.OnceValue using the first observed environment proxy signature. If HTTP_PROXY/NO_PROXY change at runtime (or across tests), newEnvironmentProxyTransport() will return a different transport, but this code will still swap in the stale OnceValue transport, potentially using the wrong proxy/bypass rules. Consider caching the HTTP/1.1 clone keyed by the same environment signature (similar to environmentProxyTransportCache) instead of a single OnceValue instance.

@Arron196 Arron196 merged commit 3fb6e66 into main Apr 3, 2026
1 of 2 checks passed