
Conversation


@FammasMaz FammasMaz commented Dec 19, 2025

Important

Adds an Anthropic endpoint for Claude Code with format translation, and updates the Antigravity provider for compatibility.

  • Endpoints:
    • Adds /v1/messages and /v1/messages/count_tokens endpoints in main.py for Anthropic API compatibility.
    • Implements anthropic_messages and anthropic_count_tokens functions to handle requests and responses in Anthropic format.
  • Models:
    • Introduces AnthropicTextBlock, AnthropicImageSource, AnthropicImageBlock, AnthropicToolUseBlock, AnthropicToolResultBlock, AnthropicMessage, AnthropicTool, AnthropicThinkingConfig, AnthropicMessagesRequest, AnthropicUsage, AnthropicMessagesResponse, AnthropicCountTokensRequest, and AnthropicCountTokensResponse in main.py.
  • Format Translation:
    • Adds functions anthropic_to_openai_messages, anthropic_to_openai_tools, anthropic_to_openai_tool_choice, and openai_to_anthropic_response in main.py for format conversion.
  • Streaming:
    • Implements anthropic_streaming_wrapper in main.py to convert OpenAI streaming format to Anthropic streaming format.
  • Antigravity Provider:
    • Updates _transform_to_antigravity_format in antigravity_provider.py to ensure max_tokens is greater than thinking.budget_tokens for Claude models.

This description was created by Ellipsis for aef492b. You can customize this summary. It will automatically update as commits are pushed.

Copilot AI review requested due to automatic review settings December 19, 2025 14:38
@mirrobot-agent

Starting my review of the Anthropic endpoint implementation. This is a significant addition bringing Claude Code compatibility via format translation between Anthropic and OpenAI APIs. Analyzing the endpoint handlers, model definitions, and streaming wrapper now.


@mirrobot-agent mirrobot-agent bot left a comment


Overall Assessment

This PR delivers a well-structured Anthropic API compatibility layer, enabling Claude Code and other Anthropic clients to use the proxy via format translation. The implementation demonstrates solid understanding of both API formats and handles edge cases thoughtfully (e.g., forcing high thinking for Opus models, defensive error handling in streaming).

Verdict: The code quality is good with comprehensive error handling and proper Anthropic error format responses. There are two minor improvements suggested inline, but neither is blocking.

Architectural Feedback

The approach of translating Anthropic format → OpenAI format → processing → translating back is sound and aligns with the existing proxy architecture. The streaming wrapper correctly maintains state across the SSE event lifecycle, and the block index tracking for tool calls is well-implemented.

One architectural note: The code adds ~940 lines to main.py, which is already substantial. Consider whether these Anthropic-specific handlers could be extracted to a separate module (e.g., anthropic_compat.py) in a future refactor to improve maintainability.

Key Suggestions

  1. Tool ID collision risk (inline comment): The fallback int(time.time()) for tool IDs could collide with parallel calls. Using UUID is recommended.

  2. Unused state variables (inline comment): accumulated_text and accumulated_thinking in the streaming wrapper are tracked but never used.

Minor Observations (Not Inline)

  • Token counting for tools: The count_tokens endpoint serializes tool definitions to JSON text and counts that. This is an approximation since tokenizers may handle function schemas differently. Consider documenting this limitation.

  • Reasoning effort thresholds: The budget_tokens → reasoning_effort mapping (32000/10000/5000) works but lacks inline documentation explaining the rationale (a hedged sketch of the mapping follows this list).

  • Testing: No tests are included in this PR for the new endpoints. Consider adding test coverage for the format translation functions and endpoint handlers.
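
For illustration, a minimal sketch of the kind of mapping described in the second bullet above. The 32000/10000/5000 thresholds come from this review; the comparison operators, the tier used for small budgets, and the helper name are assumptions rather than the PR's actual code (the real logic sits inline in main.py).

def budget_to_reasoning_effort(budget_tokens: int) -> str:
    # Thresholds as mentioned in this review; exact boundaries are assumed.
    if budget_tokens >= 32000:
        return "high"
    if budget_tokens >= 10000:
        return "medium"
    # 5000 is the lowest documented threshold; anything at or below it is
    # assumed to map to the smallest effort tier here.
    return "low"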

Questions for the Author

  1. Are the accumulated_text and accumulated_thinking variables in the streaming wrapper intended for future use (e.g., logging), or are they leftover from development?

  2. Is there a plan to add integration tests for the Anthropic endpoint compatibility?

This review was generated by an AI assistant.

content_blocks.append(
    {
        "type": "tool_use",
        "id": tc.get("id", f"toolu_{int(time.time())}"),


Using int(time.time()) as a fallback tool ID could cause collisions if multiple tool calls are processed within the same second. Since uuid is already imported in this file, consider using a UUID for more robust uniqueness.

Suggested change
- "id": tc.get("id", f"toolu_{int(time.time())}"),
+ "id": tc.get("id", f"toolu_{uuid.uuid4().hex[:12]}"),

thinking_block_started = False
current_block_index = 0
accumulated_text = ""
accumulated_thinking = ""


The variables accumulated_text and accumulated_thinking are initialized and updated throughout the streaming wrapper but are never read. If these are intended for debugging or future use (e.g., logging the complete response), consider adding a comment explaining their purpose. Otherwise, removing them would reduce cognitive overhead.


Copilot AI left a comment


Pull request overview

This PR adds Anthropic API endpoint support to enable compatibility with Claude Code and other Anthropic API clients. The implementation translates between Anthropic's message format and the existing OpenAI-compatible format used internally.

Key changes:

  • Added Anthropic Messages API endpoint (/v1/messages) with streaming and non-streaming support
  • Implemented format translation between Anthropic and OpenAI request/response structures
  • Added Claude extended thinking parameter validation to ensure max_tokens > thinking_budget

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 14 comments.

  • src/rotator_library/providers/antigravity_provider.py: Adds validation logic to ensure max_tokens exceeds thinking_budget for Claude models with extended thinking enabled
  • src/proxy_app/main.py: Implements the Anthropic Messages API endpoint with request/response translation, streaming support, a token counting endpoint, and dual authentication (x-api-key and Bearer token)


anthropic_usage["cache_read_input_tokens"] = details["cached_tokens"]

return {
"id": openai_response.get("id", f"msg_{int(time.time())}"),

Copilot AI Dec 19, 2025


Using time.time() as a fallback for message IDs can generate duplicate IDs if multiple messages are processed within the same second. Consider using uuid or a more robust unique ID generation strategy for the fallback case, similar to request_id generation at line 1672.
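
A minimal sketch of that fallback, assuming uuid is already imported in main.py (as it is for the tool-ID case discussed above); the truncation length mirrors the tool-ID suggestion and is an assumption:

"id": openai_response.get("id", f"msg_{uuid.uuid4().hex[:12]}"),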

Comment on lines +2658 to +3542
# CRITICAL: For Claude with extended thinking, max_tokens MUST be > thinking.budget_tokens
# Per Claude docs: https://docs.claude.com/en/docs/build-with-claude/extended-thinking
# If this constraint is violated, the API returns 400 INVALID_ARGUMENT

Copilot AI Dec 19, 2025


The comment references Claude documentation, but the logic is being applied to all Claude models via the antigravity provider. Consider clarifying which specific models or model versions this constraint applies to, as extended thinking may not be available for all Claude models.

Suggested change
- # CRITICAL: For Claude with extended thinking, max_tokens MUST be > thinking.budget_tokens
- # Per Claude docs: https://docs.claude.com/en/docs/build-with-claude/extended-thinking
- # If this constraint is violated, the API returns 400 INVALID_ARGUMENT
+ # CRITICAL: For Claude models that support extended thinking (e.g., Sonnet 4.5 and Opus 4.5
+ # via the Antigravity proxy), max_tokens MUST be > thinking.budget_tokens when
+ # extended thinking is enabled. Per Claude docs:
+ # https://docs.claude.com/en/docs/build-with-claude/extended-thinking
+ # If this constraint is violated for those models, the API can return 400 INVALID_ARGUMENT

Comment on lines +813 to +694
# Check x-api-key first (Anthropic style)
if x_api_key and x_api_key == PROXY_API_KEY:
    return x_api_key
# Fall back to Bearer token (OpenAI style)
if auth and auth == f"Bearer {PROXY_API_KEY}":
    return auth
raise HTTPException(status_code=401, detail="Invalid or missing API Key")

Copilot AI Dec 19, 2025


When PROXY_API_KEY is not set or empty (open access mode), this function will always raise an HTTPException because neither condition will match. This is inconsistent with verify_api_key at line 794 which allows access when PROXY_API_KEY is not set. Consider adding a check similar to line 794 to allow open access mode.
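
A hedged sketch of what the adjusted dependency could look like, mirroring the open-access behavior of verify_api_key described above; the function name and header plumbing are assumptions, not the code in this PR, and PROXY_API_KEY is the module-level setting already used here:

from typing import Optional
from fastapi import Header, HTTPException

async def verify_anthropic_api_key(
    x_api_key: Optional[str] = Header(None),
    authorization: Optional[str] = Header(None),
) -> Optional[str]:
    # Open access mode: no proxy key configured, so accept every request.
    if not PROXY_API_KEY:
        return None
    # Anthropic-style header.
    if x_api_key and x_api_key == PROXY_API_KEY:
        return x_api_key
    # OpenAI-style Bearer token.
    if authorization and authorization == f"Bearer {PROXY_API_KEY}":
        return authorization
    raise HTTPException(status_code=401, detail="Invalid or missing API Key")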

Comment on lines 1727 to 1732
elif "opus" in body.model.lower():
# Force high thinking for Opus models when no thinking config is provided
# Opus 4.5 always uses the -thinking variant, so we want maximum thinking budget
# Without this, the backend defaults to thinkingBudget: -1 (auto) instead of high
openai_request["reasoning_effort"] = "high"
openai_request["custom_reasoning_budget"] = True

Copilot AI Dec 19, 2025


The model name matching logic uses a simple substring check ("opus" in body.model.lower()), which could match unintended model names. Consider using a more specific pattern or a list of known Opus model names to avoid false positives with models that might contain "opus" in their name but aren't Claude Opus models.
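
One possible tightening, sketched under the assumption that Claude Opus model ids always contain "claude" somewhere before "opus" (a later commit in this PR mentions moving to safer regex matching and cites "magnum-opus-model" as the false positive to avoid); the pattern below is illustrative, not the one actually committed:

import re

# Require "claude" ahead of "opus" so names that merely contain the word
# "opus" (e.g. "magnum-opus-model") are not treated as Claude Opus.
OPUS_MODEL_RE = re.compile(r"claude[\w./-]*opus", re.IGNORECASE)

def is_opus_model(model: str) -> bool:
    return bool(OPUS_MODEL_RE.search(model))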

Comment on lines 1691 to 1708
# Build OpenAI-compatible request
openai_request = {
    "model": body.model,
    "messages": openai_messages,
    "max_tokens": body.max_tokens,
    "stream": body.stream or False,
}

if body.temperature is not None:
    openai_request["temperature"] = body.temperature
if body.top_p is not None:
    openai_request["top_p"] = body.top_p
if body.stop_sequences:
    openai_request["stop"] = body.stop_sequences
if openai_tools:
    openai_request["tools"] = openai_tools
if openai_tool_choice:
    openai_request["tool_choice"] = openai_tool_choice

Copilot AI Dec 19, 2025


The top_k parameter from the Anthropic request (defined in AnthropicMessagesRequest at line 301) is not being mapped to the OpenAI request. Anthropic's top_k parameter controls sampling diversity and should be included if provided. Consider adding a check similar to lines 1699-1702 to include top_k in the openai_request when it's not None.
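
The fix could be a one-liner mirroring the optional-field checks just above, assuming the internal OpenAI-compatible request dict simply forwards top_k to providers that accept it:

if body.top_k is not None:
    openai_request["top_k"] = body.top_k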

openai_request["tools"] = openai_tools
if openai_tool_choice:
openai_request["tool_choice"] = openai_tool_choice


Copilot AI Dec 19, 2025


The metadata parameter from the Anthropic request (defined in AnthropicMessagesRequest at line 306) is not being mapped or handled. While metadata may not directly map to OpenAI format, it could be useful for logging or tracking purposes. Consider whether this parameter should be preserved for request logging or passed through as custom fields.

Suggested change
# Preserve Anthropic metadata for logging/tracking or downstream use
metadata = anthropic_request.get("metadata")
if metadata is not None:
    openai_request["metadata"] = metadata

Comment on lines +2673 to +3560
if current_max_tokens <= thinking_budget:
    lib_logger.warning(
        f"max_tokens ({current_max_tokens}) must be > thinkingBudget ({thinking_budget}). "
        f"Adjusting to {min_required_tokens}"
    )
    gen_config["maxOutputTokens"] = min_required_tokens

Copilot AI Dec 19, 2025


The condition checks if current_max_tokens <= thinking_budget, but the adjustment only occurs in this case. However, the comment states that max_tokens MUST be > thinking_budget. If current_max_tokens is between thinking_budget + 1 and thinking_budget + 1024 (exclusive), it satisfies the API requirement (> thinking_budget) but might not provide enough buffer for the actual response content. Consider changing the condition to check if current_max_tokens < min_required_tokens instead.
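
A sketch of the adjusted check, where min_required_tokens is assumed (per the comment above) to be the thinking budget plus a roughly 1024-token buffer for the visible response:

min_required_tokens = thinking_budget + 1024  # assumed buffer size

if current_max_tokens < min_required_tokens:
    lib_logger.warning(
        f"maxOutputTokens ({current_max_tokens}) leaves too little room above "
        f"thinkingBudget ({thinking_budget}); raising to {min_required_tokens}"
    )
    gen_config["maxOutputTokens"] = min_required_tokens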

Comment on lines 1128 to 1133
# Close any open thinking block
if thinking_block_started:
    yield f'event: content_block_stop\ndata: {{"type": "content_block_stop", "index": {current_block_index}}}\n\n'
    current_block_index += 1
    thinking_block_started = False


Copilot AI Dec 19, 2025


Variable thinking_block_started is not used.

Suggested change
# Close any open thinking block
if thinking_block_started:
    yield f'event: content_block_stop\ndata: {{"type": "content_block_stop", "index": {current_block_index}}}\n\n'
    current_block_index += 1
    thinking_block_started = False

if content_block_started:
    yield f'event: content_block_stop\ndata: {{"type": "content_block_stop", "index": {current_block_index}}}\n\n'
    current_block_index += 1
    content_block_started = False

Copilot AI Dec 19, 2025


Variable content_block_started is not used.

Suggested change
content_block_started = False

continue

delta = choices[0].get("delta", {})
finish_reason = choices[0].get("finish_reason")

Copilot AI Dec 19, 2025


Variable finish_reason is not used.

Suggested change
finish_reason = choices[0].get("finish_reason")

@Mirrowel
Owner

A problem I noticed from the start: the only files edited are the antigravity provider and main.py. main.py does not do anything by itself; the logic should be in client.py or helper files called by it. TL;DR: the logic should live in the library.
Though, I can make the changes myself if you don't mind.

@FammasMaz
Author

FammasMaz commented Dec 19, 2025

Sure, go ahead

Edit: oh you want me to make these changes, maybe I misunderstood?

@Mirrowel
Owner

> Sure, go ahead
>
> Edit: oh you want me to make these changes, maybe I misunderstood?

Preferably you, but it can go either way. I didn't set up any guidance for issues or PRs, contributing guidelines, etc., so it is technically my fault you didn't do that.

Not sure when I'll get around to this, but soon-ish I'll review it and work on making this part of the library proper.

This is a full-on Anthropic translation layer, right? If it works the way the bot says, it should be compatible with all providers automatically.

@Mirrowel Mirrowel added the enhancement (New feature or request), Agent Monitored (Monitored for AI Agent to review PR's and commits), and Priority labels on Dec 19, 2025
@Mirrowel
Owner

@mirrobot-agent what will it take to implement what I said? That is, move it into the library and make it a proper separate method for the Anthropic endpoint: convert to and from Anthropic and OpenAI, so that from the OpenAI side any existing provider can work as normal.

@mirrobot-agent

@Mirrowel, I'm analyzing the codebase to provide a detailed plan for moving the Anthropic translation layer from main.py into the rotator_library. I'll examine the current structure and outline what changes would be needed to make this a proper library feature.

Give me a moment to review the architecture...

@mirrobot-agent

@Mirrowel, I've completed my analysis of the codebase. Here's a detailed plan for moving the Anthropic translation layer from main.py into the rotator_library.


Summary

The goal is to extract the ~700 lines of Anthropic format translation code from main.py into the library so that:

  1. Any existing provider can work with Anthropic-formatted requests automatically
  2. The translation is a reusable component, not tied to the FastAPI proxy
  3. The library maintains its OpenAI-centric internal format while supporting Anthropic clients

What Needs to Move

From main.py, the following components would be extracted:

1. Pydantic Models (~130 lines, lines 218-348)

  • AnthropicTextBlock, AnthropicImageSource, AnthropicImageBlock
  • AnthropicToolUseBlock, AnthropicToolResultBlock, AnthropicMessage
  • AnthropicTool, AnthropicThinkingConfig, AnthropicMessagesRequest
  • AnthropicUsage, AnthropicMessagesResponse
  • AnthropicCountTokensRequest, AnthropicCountTokensResponse

2. Format Translation Functions (~200 lines, lines 822-1068)

  • anthropic_to_openai_messages() - Converts Anthropic message format to OpenAI
  • anthropic_to_openai_tools() - Converts Anthropic tool definitions
  • anthropic_to_openai_tool_choice() - Converts tool_choice parameter
  • openai_to_anthropic_response() - Converts OpenAI responses back to Anthropic format

3. Streaming Wrapper (~280 lines, lines 1071-1351)

  • anthropic_streaming_wrapper() - Converts OpenAI SSE stream to Anthropic SSE format
  • Handles: message_start, content_block_start, content_block_delta, content_block_stop, message_delta, message_stop
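
As a toy illustration of the ordering such a wrapper has to produce, here is a minimal async generator for a plain text reply; the payload shapes are abbreviated and do not match the exact dictionaries built in main.py:

import json

async def minimal_anthropic_stream(text_chunks):
    # Illustrative only: a real wrapper also tracks block indices for
    # thinking and tool_use blocks and reports usage in message_delta.
    def sse(event, payload):
        return f"event: {event}\ndata: {json.dumps(payload)}\n\n"
    yield sse("message_start", {"type": "message_start"})
    yield sse("content_block_start", {"type": "content_block_start", "index": 0,
                                      "content_block": {"type": "text", "text": ""}})
    for chunk in text_chunks:
        yield sse("content_block_delta", {"type": "content_block_delta", "index": 0,
                                          "delta": {"type": "text_delta", "text": chunk}})
    yield sse("content_block_stop", {"type": "content_block_stop", "index": 0})
    yield sse("message_delta", {"type": "message_delta", "delta": {"stop_reason": "end_turn"}})
    yield sse("message_stop", {"type": "message_stop"})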

Proposed Library Architecture

New Files to Create

src/rotator_library/
├── anthropic_compat/
│   ├── __init__.py              # Exports main classes
│   ├── models.py                # Pydantic models for Anthropic format
│   ├── translator.py            # Format translation functions
│   └── streaming.py             # Streaming format converter

Integration Point: RotatingClient

Add new methods to client.py:

class RotatingClient:
    # ... existing code ...
    
    async def anthropic_messages(
        self,
        request: AnthropicMessagesRequest,
        raw_request: Optional[Any] = None  # For disconnect checks
    ) -> Union[AnthropicMessagesResponse, AsyncGenerator]:
        """
        Handle Anthropic Messages API requests.
        
        Internally:
        1. Translates Anthropic request → OpenAI format
        2. Calls existing acompletion() with OpenAI format
        3. Translates OpenAI response → Anthropic format
        """
        # Translate request
        openai_request = anthropic_to_openai_request(request)
        
        # Call existing OpenAI-compatible method
        if request.stream:
            response_stream = self.acompletion(**openai_request)
            return anthropic_streaming_wrapper(response_stream, request.model)
        else:
            response = await self.acompletion(**openai_request)
            return openai_to_anthropic_response(response, request.model)
    
    async def anthropic_count_tokens(
        self,
        request: AnthropicCountTokensRequest
    ) -> AnthropicCountTokensResponse:
        """Handle Anthropic count_tokens API requests."""
        # Translate and use existing token_count()
        ...

Implementation Steps

Step 1: Create anthropic_compat/models.py

  • Move all Pydantic model definitions
  • Keep them identical to maintain compatibility

Step 2: Create anthropic_compat/translator.py

  • Move all anthropic_to_openai_* and openai_to_anthropic_* functions
  • Add a high-level translate_request() function that handles the full conversion

Step 3: Create anthropic_compat/streaming.py

  • Move anthropic_streaming_wrapper()
  • Refactor to be framework-agnostic (no FastAPI Request dependency)
  • Accept a generic is_disconnected callback for client disconnect checks (see the sketch after these steps)

Step 4: Update client.py

  • Add anthropic_messages() method
  • Add anthropic_count_tokens() method
  • Import from anthropic_compat module

Step 5: Update __init__.py

  • Export new Anthropic classes for external use

Step 6: Simplify main.py

  • Replace ~700 lines with ~50 lines calling client.anthropic_messages()
  • Keep only FastAPI endpoint definitions and error handling
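
For Step 3, a sketch of what a framework-agnostic signature could look like; the parameter names are assumptions and the body is reduced to the disconnect check:

import json
from typing import AsyncIterator, Awaitable, Callable, Optional

async def anthropic_streaming_wrapper(
    openai_stream: AsyncIterator[dict],
    model: str,
    is_disconnected: Optional[Callable[[], Awaitable[bool]]] = None,
) -> AsyncIterator[str]:
    # No FastAPI Request here: the caller supplies an async callback that
    # reports whether the client has gone away.
    async for chunk in openai_stream:
        if is_disconnected is not None and await is_disconnected():
            break
        # (translation of the OpenAI chunk into Anthropic SSE events goes here)
        yield f"data: {json.dumps(chunk)}\n\n"  # placeholder event

In the proxy, is_disconnected would simply be the incoming request.is_disconnected method (a standard Starlette API), which keeps the library itself free of FastAPI imports.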

Benefits

  1. Reusability: The translation layer can be used by any application using rotator_library, not just the proxy
  2. Maintainability: Anthropic-specific code is isolated in its own module
  3. Testability: Format translation can be unit tested independently
  4. Extensibility: Easy to add support for other API formats (e.g., Bedrock, Vertex) following the same pattern

Effort Estimate

  • Create models.py: ~130 lines, Low complexity (copy/move)
  • Create translator.py: ~200 lines, Medium (refactor for reuse)
  • Create streaming.py: ~280 lines, Medium (remove FastAPI deps)
  • Update client.py: ~80 lines, Medium (new methods)
  • Simplify main.py: -650 / +50 lines, Low (remove/replace)
  • Total: ~560 lines added, ~650 removed, Medium complexity overall

Questions

  1. Should the Anthropic models/functions be exposed in the library's public API (via __init__.py), or kept internal?
  2. Do you want me to implement this refactoring, or just provide this plan for FammasMaz or yourself to implement?

Let me know how you'd like to proceed!

This analysis was generated by an AI assistant.

@Mirrowel
Owner

Mirrowel commented Dec 19, 2025

@mirrobot-agent implement this, and open a separate PR with it against this PR. Make sure it is comprehensive and integrates with any provider the library and proxy work with. Maintain the OpenAI-compatible format internally, and convert to and from it for providers, so they can remain essentially unchanged. Try to do this iteratively: make a plan and follow it, committing as you go. You are not allowed to commit to this PR, only to the new one you will open.

@mirrobot-agent

@Mirrowel, I'm on it! I'll implement the refactoring plan to move the Anthropic translation layer from main.py into the rotator_library as a proper, reusable module.

My Plan

I'll work iteratively, committing as I go:

  1. Create new branch from feature/anthropic-endpoints
  2. Create anthropic_compat/models.py - Extract all Pydantic models
  3. Create anthropic_compat/translator.py - Extract format translation functions
  4. Create anthropic_compat/streaming.py - Extract streaming wrapper (framework-agnostic)
  5. Update client.py - Add anthropic_messages() and anthropic_count_tokens() methods
  6. Update library exports - Make new module accessible
  7. Simplify main.py - Replace ~700 lines with calls to library
  8. Open PR against this branch

Starting now... 🚀

This task is being handled by an AI assistant.

@mirrobot-agent

Refactoring PR Created

I've created PR #47 which refactors the Anthropic translation layer from this PR into the rotator_library as a proper, reusable module.

Key Changes in #47:

  • New rotator_library/anthropic_compat/ module with:

    • Pydantic models for Anthropic API
    • Format translation functions (Anthropic ↔ OpenAI)
    • Framework-agnostic streaming wrapper
  • New RotatingClient methods:

    • anthropic_messages() - Handle Anthropic Messages API
    • anthropic_count_tokens() - Handle token counting
  • Simplified main.py: Removed ~663 lines of local code, now uses library components

Benefits:

  1. Reusability for other applications using rotator_library
  2. Clear separation between library and application code
  3. Independent unit testability
  4. Consistent with existing library architecture

See: #47

…atibility

- Add /v1/messages endpoint with Anthropic-format request/response
- Support both x-api-key and Bearer token authentication
- Implement Anthropic <-> OpenAI format translation for messages, tools, and responses
- Add streaming wrapper converting OpenAI SSE to Anthropic SSE events
- Handle tool_use blocks with proper stop_reason detection
- Fix NoneType iteration bug in tool_calls handling
- Add AnthropicThinkingConfig model and thinking parameter to request
- Translate Anthropic thinking config to reasoning_effort for providers
- Handle reasoning_content in streaming wrapper (thinking_delta events)
- Convert reasoning_content to thinking blocks in non-streaming responses
When no thinking config is provided in the request, Opus models now
automatically use reasoning_effort=high with custom_reasoning_budget=True.

This ensures Opus 4.5 uses the full 32768 token thinking budget instead
of the backend's auto mode (thinkingBudget: -1) which may use less.

Opus always uses the -thinking variant regardless, but this change
guarantees maximum thinking capacity for better reasoning quality.
…ling

- Add validation to ensure maxOutputTokens > thinkingBudget for Claude
  extended thinking (prevents 400 INVALID_ARGUMENT API errors)
- Improve streaming error handling to send proper message_start and
  content blocks before error event for better client compatibility
- Minor code formatting improvements
Track each tool_use block index separately and emit content_block_stop
for all blocks (thinking, text, and each tool_use) when stream ends.
Fixes Claude Code stopping mid-action due to malformed streaming events.
…nabled

- Fixed bug where budget_tokens between 10000-32000 would get ÷4 reduction
- Now any explicit thinking request sets custom_reasoning_budget=True
- Added logging to show thinking budget, effort level, and custom_budget flag
- Simplified budget tier logic (removed redundant >= 32000 check)

Before: 31999 tokens requested → 8192 tokens actual (÷4 applied)
After:  31999 tokens requested → 32768 tokens actual (full "high" budget)
When using /v1/chat/completions with Opus and reasoning_effort="high" or
"medium", automatically set custom_reasoning_budget=true to get full
thinking tokens instead of the ÷4 reduced default.

This makes the OpenAI endpoint behave consistently with the Anthropic
endpoint for Opus models - if you're using Opus with high reasoning,
you want the full thinking budget.

Adds logging: "🧠 Thinking: auto-enabled custom_reasoning_budget for Opus"
…treaming

Claude Code and other Anthropic SDK clients require message_start to be
sent before any other SSE events. When a stream completed quickly without
content chunks, the wrapper would send message_stop without message_start,
causing clients to silently discard all output.
Signed-off-by: Moeeze Hassan <fammas.maz@gmail.com>
- Create rotator_library/anthropic_compat module with models, translator, and streaming
- Add anthropic_messages() and anthropic_count_tokens() methods to RotatingClient
- Simplify main.py endpoints to use library methods
- Remove ~762 lines of duplicate code from main.py
- Fix: Use UUID instead of time.time() for tool/message IDs (avoids collisions)
- Fix: Remove unused accumulated_text/accumulated_thinking variables
- Fix: Map top_k parameter from Anthropic to OpenAI format
- Add comment explaining metadata parameter is intentionally not mapped
  (OpenAI doesn't have an equivalent field)
- Use safer regex pattern matching for Opus model detection
  (avoids false positives like "magnum-opus-model")
- Document reasoning budget thresholds and // 4 reduction behavior
- Conserve thinking tokens for Opus auto-detection (use // 4 like other models)
  Only set custom_reasoning_budget=True when user explicitly requests 32000+ tokens
@FammasMaz FammasMaz force-pushed the feature/anthropic-endpoints branch from aef492b to d91f98b on December 20, 2025 21:35

@mirrobot-agent mirrobot-agent bot left a comment


Follow-up Review

Previous Feedback Status

Both suggestions from the first review have been addressed:

  1. Tool ID collision risk: The fallback tool ID generation now uses uuid.uuid4().hex[:12] instead of int(time.time()) (translator.py:253). This eliminates the collision risk for parallel requests.

  2. Unused state variables: The accumulated_text and accumulated_thinking variables have been removed from the streaming wrapper. The code now only tracks the variables it actually uses.

New Changes Review

This batch of commits includes several well-implemented features:

Anthropic Compat Module Refactoring:

  • Clean extraction of translation logic into rotator_library/anthropic_compat/
  • Proper separation: models.py (Pydantic), translator.py (format conversion), streaming.py (SSE wrapper)
  • The refactor significantly improves maintainability by reducing main.py complexity

Quota Tracking System:

  • AntigravityQuotaTracker mixin provides API-based quota baseline tracking
  • Background job system allows provider-specific periodic tasks
  • TransientQuotaError correctly handles bare 429s (without retry info) as transient errors, enabling internal retries before credential rotation

Model Support Expansion:

  • New models: gemini-2.5-flash, gemini-2.5-flash-lite, gemini-3-flash, gpt-oss-120b-medium
  • Updated quota groups reflect empirical quota sharing behavior

Minor Observations

  • Token counting approximation (from first review): Still uses JSON serialization for tools. This is acknowledged in the implementation and is a reasonable trade-off.

  • Model Filter GUI & Quota Viewer TUI: Large additions (~5K lines combined) but contained to their own files. The virtual canvas implementation for performance is a good choice for large model lists.

Overall, this is a solid incremental update that improves the codebase structure while adding valuable functionality.

This review was generated by an AI assistant.

@FammasMaz
Author

Maybe this is better now... that was a rebase.

Tool results with images (e.g., from Read tool) were being dropped during
Anthropic→OpenAI translation, and not properly converted to Gemini format.

- translator.py: Extract image blocks from tool_result content and convert
  to OpenAI image_url format
- antigravity_provider.py: Handle multimodal tool responses by converting
  image_url to Gemini inlineData format

@mirrobot-agent mirrobot-agent bot left a comment


Follow-up Review

This incremental update adds multimodal support for tool responses across the Anthropic translation layer and the Antigravity provider. The changes enable images from tool results to be properly converted between API formats.

Assessment of New Changes:

The implementation is well-structured:

  • translator.py: Now handles Anthropic tool_result blocks that contain mixed text and images, converting them to OpenAI's multimodal format. The code properly handles all cases: string content, list of text blocks, list with images, and empty content (a hedged sketch follows this list).

  • antigravity_provider.py: The _transform_tool_message method now converts OpenAI's image_url format to Gemini's inlineData format. The docstring update clearly documents the new behavior.
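
A hedged sketch of that tool_result conversion, using the Anthropic block shapes already defined in this PR (base64 image sources); the helper name is illustrative rather than the actual translator.py function:

def tool_result_to_openai_content(block: dict) -> list:
    # Anthropic tool_result content may be a plain string or a list of
    # text/image blocks; OpenAI-style multimodal content is a list of parts.
    content = block.get("content", "")
    if isinstance(content, str):
        return [{"type": "text", "text": content}] if content else []
    parts = []
    for item in content:
        if item.get("type") == "text":
            parts.append({"type": "text", "text": item.get("text", "")})
        elif item.get("type") == "image":
            source = item.get("source", {})
            if source.get("type") == "base64":
                data_url = (
                    f"data:{source.get('media_type', 'image/png')};"
                    f"base64,{source.get('data', '')}"
                )
                parts.append({"type": "image_url", "image_url": {"url": data_url}})
    return parts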

Minor Suggestion:
One inline comment regarding logging for debugging when non-data URLs are encountered.

Overall Status:
The changes are solid and ready to merge. The multimodal handling completes the tool response flow across the translation layers.

This review was generated by an AI assistant.

Comment on lines +2475 to +2490
elif item_type == "image_url":
    # Convert OpenAI image_url format to Gemini inlineData
    image_url = item.get("image_url", {}).get("url", "")
    if image_url.startswith("data:"):
        try:
            # Parse: data:image/png;base64,iVBORw0KG...
            header, data = image_url.split(",", 1)
            mime_type = header.split(":")[1].split(";")[0]
            image_parts.append({
                "inlineData": {
                    "mimeType": mime_type,
                    "data": data,
                }
            })
        except Exception as e:
            lib_logger.warning(f"Failed to parse image data URL in tool response: {e}")


Consider logging a warning when URL-based images are encountered but cannot be converted. Currently, only data: URLs are parsed; external URLs (e.g., https://...) are silently skipped. While this may be intentional since external URLs cannot be converted to inlineData without fetching them, a debug/warning log would help when debugging image handling issues.

Suggested change (replace the block above with):

if item_type == "text":
    text_parts.append(item.get("text", ""))
elif item_type == "image_url":
    # Convert OpenAI image_url format to Gemini inlineData
    image_url = item.get("image_url", {}).get("url", "")
    if image_url.startswith("data:"):
        try:
            # Parse: data:image/png;base64,iVBORw0KG...
            header, data = image_url.split(",", 1)
            mime_type = header.split(":")[1].split(";")[0]
            image_parts.append({
                "inlineData": {
                    "mimeType": mime_type,
                    "data": data,
                }
            })
        except Exception as e:
            lib_logger.warning(f"Failed to parse image data URL in tool response: {e}")
    else:
        lib_logger.debug(f"Skipping non-data URL image in tool response (cannot convert to inlineData): {image_url[:50]}...")
