Conversation
…overage - Add rate limit header support to retry interceptor - Parse Retry-After header (seconds and HTTP date formats) - Parse X-RateLimit-Reset header as fallback - New constructor: NewRetryWithRateLimitHeaders() - Safety: ignore values >24h, fallback to default delay - Add comprehensive test coverage - interceptors: 51.6% -> 93.6% - openai_compatible: 52.6% -> 91.2% - New test files: addheader_test.go, billing_test.go, coverage_test.go - 123 total tests, all passing - Update DESIGN.md with retry rate limit header docs
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 5 minutes and 55 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughAdded a retry constructor that uses rate-limit headers to derive inter-retry delays, updated Retry interceptor behavior and parsing, set an example to use the new constructor, modified request enricher header logic, and added extensive unit tests for interceptors and OpenAI-compatible provider parsing/resolution. Changes
🚥 Pre-merge checks | ✅ 1✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (4)
providers/openai_compatible/parser_test.go (2)
313-334: AssertTotalTokensin the zero-usage case too.This fixture includes
total_tokens, but the test never verifies it. A regression inmeta.Usage.TotalTokenswould slip through while this case still passes.Suggested change
if meta.Usage.PromptTokens != 0 { t.Errorf("PromptTokens = %d, want 0", meta.Usage.PromptTokens) } if meta.Usage.CompletionTokens != 0 { t.Errorf("CompletionTokens = %d, want 0", meta.Usage.CompletionTokens) } + if meta.Usage.TotalTokens != 0 { + t.Errorf("TotalTokens = %d, want 0", meta.Usage.TotalTokens) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@providers/openai_compatible/parser_test.go` around lines 313 - 334, The test TestExtractor_ZeroUsage omits an assertion for meta.Usage.TotalTokens even though the fixture includes total_tokens; add an assertion in TestExtractor_ZeroUsage to verify meta.Usage.TotalTokens == 0 (similar to the existing checks for PromptTokens and CompletionTokens) so regressions to TotalTokens are caught—update the test in providers/openai_compatible/parser_test.go within the TestExtractor_ZeroUsage function to include this check.
189-193: Consider a small response helper for the extractor tests.These blocks all build the same
http.Responseshape. Pulling that into a tiny helper would reduce repetition and keep each test focused on the assertions that differ.Possible cleanup
func newJSONResponse(body string) *http.Response { return &http.Response{ StatusCode: 200, Header: make(http.Header), Body: io.NopCloser(strings.NewReader(body)), } }Then each test can just do:
resp := newJSONResponse(body)Also applies to: 245-249, 271-275, 297-301, 317-321, 339-343
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@providers/openai_compatible/parser_test.go` around lines 189 - 193, Add a small test helper newJSONResponse(body string) *http.Response in parser_test.go that returns &http.Response{StatusCode:200, Header:make(http.Header), Body:io.NopCloser(strings.NewReader(body))}, then replace each repeated construction of resp := &http.Response{...} in the extractor tests (the blocks that set StatusCode, Header and Body) with resp := newJSONResponse(body) to reduce duplication and keep tests focused on assertions.examples/basic/main.go (1)
234-234: Consider using the new header-aware retry constructor in this example.On Line 234,
NewRetryskips the newRetry-After/X-RateLimit-Resetbehavior added by this PR. For feature-aligned docs/example behavior, preferNewRetryWithRateLimitHeaders.💡 Suggested update
- llmproxy.WithInterceptor(interceptors.NewRetry(3, time.Millisecond*250)), + llmproxy.WithInterceptor(interceptors.NewRetryWithRateLimitHeaders(3, time.Millisecond*250)),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/basic/main.go` at line 234, Replace the old retry constructor with the header-aware variant: in the interceptor list where interceptors.NewRetry(3, time.Millisecond*250) is used (passed into llmproxy.WithInterceptor), call interceptors.NewRetryWithRateLimitHeaders(3, time.Millisecond*250) instead so the example respects Retry-After / X-RateLimit-Reset header behavior.interceptors/billing_test.go (1)
61-67: Use tolerance for floating-point cost assertion.On Line 65, exact equality for
float64can make this test fragile. Prefer epsilon-based comparison.💡 Suggested fix
import ( + "math" "net/http" "net/http/httptest" "testing" @@ - if result.TotalCost != expectedTotal { + if math.Abs(result.TotalCost-expectedTotal) > 1e-12 { t.Errorf("TotalCost = %f, want %f", result.TotalCost, expectedTotal) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interceptors/billing_test.go` around lines 61 - 67, The test currently asserts exact equality between result.TotalCost and expectedTotal (using expectedInputCost and expectedOutputCost), which is fragile for float64; replace the direct equality check with an epsilon-based comparison: compute a small tolerance (e.g. epsilon := 1e-9 or 1e-6) and fail the test only if math.Abs(result.TotalCost-expectedTotal) > epsilon, using t.Errorf (or t.Fatalf) to report the mismatch and include both values for diagnostics.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@interceptors/addheader_test.go`:
- Around line 32-45: After each successful call to add.Intercept in the tests
(the resp variable returned by add.Intercept), ensure the response body is
closed to avoid leaking HTTP connections: immediately after checking err == nil
add a defer resp.Body.Close() (or explicitly drain and close resp.Body) so the
body is always closed for the success paths in the test cases that exercise
add.Intercept (the resp variable used on lines where headers are asserted).
In `@interceptors/coverage_test.go`:
- Around line 305-339: TestRetryInterceptor_RetryAfterHeader never captures the
Retry-After value (retryAfter is never assigned) so the elapsed check is
comparing against zero and permits false positives; update the test to (1) read
and parse the Retry-After header returned by the upstream server into retryAfter
(e.g., in the upstream handler or by extracting it from the first response in
the next function) and fail the test if parsing fails or value is zero, (2) call
retry.Intercept (via NewRetryWithRateLimitHeaders) and assert elapsed is at
least retryAfter seconds and no more than retryAfter+smallTolerance (e.g.,
100-200ms) to avoid flakiness, and (3) keep references to
TestRetryInterceptor_RetryAfterHeader, NewRetryWithRateLimitHeaders, and
Intercept when locating where to change the test logic; apply the same pattern
to the other similar tests covering lines mentioned.
In `@interceptors/retry.go`:
- Around line 84-85: The boundary check currently rejects exactly 24h because
conditions like "seconds > 0 && seconds < 86400" exclude 86400; update those
comparisons to include 86400 (e.g., change to "seconds > 0 && seconds <= 86400")
so values equal to 24h are accepted; make the same change for the other
occurrence(s) of the "< 86400" check (referencing the "seconds" variable and the
existing conditional expressions in interceptors/retry.go).
---
Nitpick comments:
In `@examples/basic/main.go`:
- Line 234: Replace the old retry constructor with the header-aware variant: in
the interceptor list where interceptors.NewRetry(3, time.Millisecond*250) is
used (passed into llmproxy.WithInterceptor), call
interceptors.NewRetryWithRateLimitHeaders(3, time.Millisecond*250) instead so
the example respects Retry-After / X-RateLimit-Reset header behavior.
In `@interceptors/billing_test.go`:
- Around line 61-67: The test currently asserts exact equality between
result.TotalCost and expectedTotal (using expectedInputCost and
expectedOutputCost), which is fragile for float64; replace the direct equality
check with an epsilon-based comparison: compute a small tolerance (e.g. epsilon
:= 1e-9 or 1e-6) and fail the test only if
math.Abs(result.TotalCost-expectedTotal) > epsilon, using t.Errorf (or t.Fatalf)
to report the mismatch and include both values for diagnostics.
In `@providers/openai_compatible/parser_test.go`:
- Around line 313-334: The test TestExtractor_ZeroUsage omits an assertion for
meta.Usage.TotalTokens even though the fixture includes total_tokens; add an
assertion in TestExtractor_ZeroUsage to verify meta.Usage.TotalTokens == 0
(similar to the existing checks for PromptTokens and CompletionTokens) so
regressions to TotalTokens are caught—update the test in
providers/openai_compatible/parser_test.go within the TestExtractor_ZeroUsage
function to include this check.
- Around line 189-193: Add a small test helper newJSONResponse(body string)
*http.Response in parser_test.go that returns &http.Response{StatusCode:200,
Header:make(http.Header), Body:io.NopCloser(strings.NewReader(body))}, then
replace each repeated construction of resp := &http.Response{...} in the
extractor tests (the blocks that set StatusCode, Header and Body) with resp :=
newJSONResponse(body) to reduce duplication and keep tests focused on
assertions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 271c11df-7132-4ab7-b559-8c49173af552
📒 Files selected for processing (7)
DESIGN.mdexamples/basic/main.gointerceptors/addheader_test.gointerceptors/billing_test.gointerceptors/coverage_test.gointerceptors/retry.goproviders/openai_compatible/parser_test.go
📜 Review details
🔇 Additional comments (1)
DESIGN.md (1)
317-331: Nice addition—retry header semantics are clearly documented.The constructor purpose, precedence rules, and example are clear and easy to apply.
Inline comment fixes: - addheader_test.go: Close response body to prevent connection leaks - coverage_test.go: Fix flaky RetryAfter tests with proper timing tolerances - retry.go: Accept exactly 24h (86400s) in boundary checks, not just < 24h Nitpick fixes: - examples/basic/main.go: Use NewRetryWithRateLimitHeaders in example - billing_test.go: Use epsilon-based float comparison for TotalCost - parser_test.go: Add TotalTokens assertion in TestExtractor_ZeroUsage - Add test for exactly 24h boundary case
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
interceptors/retry.go (1)
51-54:⚠️ Potential issue | 🟠 MajorClose intermediate retry responses before the next attempt.
On Line 52 retryable paths continue without closing
lastResp.Body, which can leak connections under sustained 429/5xx retries.💡 Proposed fix
lastResp, lastMeta, lastRawRespBody, lastErr = next(req) if !isRetryable(lastResp, lastErr) { return lastResp, lastMeta, lastRawRespBody, lastErr } + if attempt < i.MaxAttempts && lastResp != nil && lastResp.Body != nil { + _ = lastResp.Body.Close() + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interceptors/retry.go` around lines 51 - 54, The retry loop is leaking response bodies when isRetryable(lastResp, lastErr) returns true; before proceeding to the next attempt in the path that calls next(req) again, ensure you drain and close the previous response body (check lastResp != nil and lastResp.Body != nil, call io.Copy(io.Discard, lastResp.Body) then lastResp.Body.Close()) so connections are returned to the pool; add this cleanup right after lastResp, lastMeta, lastRawRespBody, lastErr = next(req) and before the isRetryable check/continuation in the retry logic that uses next(req).
♻️ Duplicate comments (1)
interceptors/addheader_test.go (1)
73-76:⚠️ Potential issue | 🟡 MinorClose response bodies in the remaining success paths.
Line 73 and Line 142 still ignore the returned
resp; on success, this leavesresp.Bodyunclosed and can leak test HTTP connections.Proposed fix
- _, _, _, err := add.Intercept(req, llmproxy.BodyMetadata{}, nil, next) + resp, _, _, err := add.Intercept(req, llmproxy.BodyMetadata{}, nil, next) if err != nil { t.Fatalf("Intercept returned error: %v", err) } + defer resp.Body.Close()- _, _, _, err := add.Intercept(req, llmproxy.BodyMetadata{}, nil, next) + resp, _, _, err := add.Intercept(req, llmproxy.BodyMetadata{}, nil, next) if err != nil { t.Fatalf("Intercept returned error: %v", err) } + defer resp.Body.Close()Also applies to: 142-145
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interceptors/addheader_test.go` around lines 73 - 76, The test currently ignores the returned resp from add.Intercept which can leak HTTP connections; after the call to add.Intercept (the places around the uses at lines where the test asserts err == nil) check that resp is non-nil and close its Body (e.g., call resp.Body.Close() or defer resp.Body.Close() immediately after the nil error check) so both success paths (the call near add.Intercept and the other call later) properly release the response body.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@interceptors/retry.go`:
- Around line 75-78: The parsing currently assigns
resp.Header.Get("Retry-After") or X-RateLimit-Reset into retryAfter and then
tries to parse it, which prevents falling back when Retry-After is present but
malformed; update the logic in the function (where retryAfter is set/parsed in
interceptors/retry.go) to first read Retry-After and attempt both numeric and
HTTP-date parsing on that value (handle strconv.Atoi and time.Parse with
time.RFC1123/RFC1123Z/etc.), and only if those parses fail or the header is
absent then read and parse X-RateLimit-Reset; ensure the code path returns the
correct parsed duration/epoch and does not treat a malformed Retry-After as a
successful fallback candidate.
In `@providers/openai_compatible/parser_test.go`:
- Around line 445-454: The test TestProvider_NewWithProvider only checks
provider.Name(); update it to also assert that the supplied base provider's
injected component is preserved by NewWithProvider: when creating base with
llmproxy.NewBaseProvider("custom", llmproxy.WithBodyParser(&Parser{})), verify
that the resulting provider returned by NewWithProvider("custom", base) retains
that same body parser instance (or at least an equal parser) rather than a fresh
default—i.e., compare the provider's body parser to the original base's body
parser via the same accessor used elsewhere (e.g., BodyParser or GetBodyParser)
so the wiring contract is actually enforced.
- Around line 170-182: The test currently expects an Authorization header with
an empty bearer token; update behavior so Enricher does not set the
Authorization header when created with an empty API key. Change Enrich (method
on the Enricher returned by NewEnricher) to conditionally add the
"Authorization: Bearer <key>" header only when the key is non-empty, and update
the TestEnricher_EmptyKey expectation to assert that the Authorization header is
absent (empty Get should return ""). Locate references to NewEnricher and Enrich
in providers/openai_compatible/enricher.go and adjust the header write logic
accordingly.
---
Outside diff comments:
In `@interceptors/retry.go`:
- Around line 51-54: The retry loop is leaking response bodies when
isRetryable(lastResp, lastErr) returns true; before proceeding to the next
attempt in the path that calls next(req) again, ensure you drain and close the
previous response body (check lastResp != nil and lastResp.Body != nil, call
io.Copy(io.Discard, lastResp.Body) then lastResp.Body.Close()) so connections
are returned to the pool; add this cleanup right after lastResp, lastMeta,
lastRawRespBody, lastErr = next(req) and before the isRetryable
check/continuation in the retry logic that uses next(req).
---
Duplicate comments:
In `@interceptors/addheader_test.go`:
- Around line 73-76: The test currently ignores the returned resp from
add.Intercept which can leak HTTP connections; after the call to add.Intercept
(the places around the uses at lines where the test asserts err == nil) check
that resp is non-nil and close its Body (e.g., call resp.Body.Close() or defer
resp.Body.Close() immediately after the nil error check) so both success paths
(the call near add.Intercept and the other call later) properly release the
response body.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 63799305-12cc-4cc3-8cba-665f656eb48d
📒 Files selected for processing (7)
DESIGN.mdexamples/basic/main.gointerceptors/addheader_test.gointerceptors/billing_test.gointerceptors/coverage_test.gointerceptors/retry.goproviders/openai_compatible/parser_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
- DESIGN.md
- interceptors/billing_test.go
- interceptors/coverage_test.go
📜 Review details
🔇 Additional comments (2)
interceptors/retry.go (1)
112-117: Nice addition of an opt-in constructor.
NewRetryWithRateLimitHeaderscleanly enables header-aware retries without changing existing constructor behavior.examples/basic/main.go (1)
234-234: Example wiring is aligned with the new retry behavior.Good update to demonstrate
NewRetryWithRateLimitHeadersin the primary OpenAI-compatible path.
Inline comment fixes: - retry.go: Fix parseRetryAfterHeader to fall back to X-RateLimit-Reset when Retry-After is malformed - retry.go: Drain and close response body in retry loop to prevent connection leaks - enricher.go: Don't set Authorization header when API key is empty - parser_test.go: Verify body parser is preserved in TestProvider_NewWithProvider - parser_test.go: Update TestEnricher_EmptyKey to expect no Authorization header - coverage_test.go: Add test for malformed Retry-After fallback behavior Duplicate comment fixes: - addheader_test.go: Close response body in tests that discard it
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@interceptors/addheader_test.go`:
- Around line 50-53: The test assigns capturedReq in the upstream handler
goroutine and reads it in the test goroutine causing a data race; update both
TestAddHeaderInterceptor_RequestHeaders and TestAddHeaderInterceptor_Both to
synchronize by creating a buffered channel (e.g., reqCh := make(chan
*http.Request, 1), send the request pointer from the handler via reqCh <- r, and
receive it in the test before inspecting headers with capturedReq := <-reqCh) so
the handler-to-test transfer is properly synchronized instead of unsafely
reading capturedReq across goroutines.
In `@providers/openai_compatible/enricher.go`:
- Around line 20-24: The Enricher.Enrich method currently skips setting
Authorization when APIKey is empty but does not remove any existing header,
allowing inbound Authorization to be forwarded; change the empty-key branch in
Enricher.Enrich to call req.Header.Del("Authorization") to explicitly remove the
header, and update the unit test TestEnricher_EmptyKey to pre-populate an
Authorization header on the request before calling Enrich and assert that it was
deleted (no Authorization present) after Enrich returns.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f4226a45-9807-4dd6-bb5d-5fa5f40b35ba
📒 Files selected for processing (5)
interceptors/addheader_test.gointerceptors/coverage_test.gointerceptors/retry.goproviders/openai_compatible/enricher.goproviders/openai_compatible/parser_test.go
✅ Files skipped from review due to trivial changes (1)
- interceptors/coverage_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- interceptors/retry.go
📜 Review details
🔇 Additional comments (7)
interceptors/addheader_test.go (2)
11-47: Strong response-header behavior coverage.This test cleanly verifies injected response headers and preservation of upstream
Content-Type, and it now safely closes the response body.
150-171: Good negative-path and constructor validation.The error passthrough assertions and
NewHeaderfield checks are clear and scoped well.Also applies to: 173-181
providers/openai_compatible/parser_test.go (5)
14-97: Good parser contract coverage.The focused cases for full-field parsing plus known-vs-custom field separation should make parser regressions much easier to localize.
99-151: Nice payload-fidelity edge cases.Locking down empty JSON, invalid JSON, multiline text, and Unicode content is exactly the kind of coverage this parser needed.
188-355: Extractor coverage looks solid.The mix of message, delta, empty choices, zero-usage, and malformed JSON cases gives this layer a much safer test surface.
357-416: Resolver scenarios are well targeted.Base URL handling, custom prefixes, invalid input, and trailing-slash normalization cover the main resolver failure modes.
418-494: Good constructor and wiring assertions.The non-nil component checks plus the stronger
NewWithProviderbody-parser assertion now exercise the provider contract much better.
- addheader_test.go: Fix data race in TestAddHeaderInterceptor_RequestHeaders and TestAddHeaderInterceptor_Both by using buffered channels for goroutine synchronization instead of shared variables - enricher.go: Delete Authorization header when API key is empty to prevent forwarding inbound auth tokens to upstream providers - parser_test.go: Update TestEnricher_EmptyKey to verify that existing Authorization header is removed when enricher has empty key
Summary
Changes
Retry Interceptor Enhancements
Retry-Afterheader (both seconds and HTTP date formats)X-RateLimit-Resetheader as fallbackNewRetryWithRateLimitHeaders(maxAttempts, defaultDelay)Test Coverage Improvements
New test files:
interceptors/addheader_test.go— 6 testsinterceptors/billing_test.go— 4 testsinterceptors/coverage_test.go— 16 tests (metrics, retry, logging, LoggerFunc)Enhanced tests:
providers/openai_compatible/parser_test.go— expanded from 4 to 23 testsTotal: 123 tests, all passing
Documentation
DESIGN.mdwith retry rate limit header documentationUsage Example
Test Plan
Summary by CodeRabbit
New Features
Documentation
Bug Fixes
Tests