Add OAuth retry logic and improve error handling in setup flow#32
Conversation
- oauth_setup.py: add 2-attempt PKCE retry loop so expired authorization codes (single-use OpenRouter limitation) trigger a fresh auth URL and browser open rather than a hard failure. Also note port 3000 is hardcoded by OpenRouter in the port-conflict error message. - setup.md: replace bare `timeout 240s` with cross-platform wrapper that falls back to `gtimeout` (macOS/Homebrew) then `timeout` (Linux) then runs without a wrapper — the script's internal 180s timeout is sufficient. - set-key.md: add /grok-swarm:set-key last-resort command that saves an API key to ~/.config/grok-swarm/config.json with chmod 600, but REQUIRES explicit user confirmation after displaying a security warning about key visibility in the LLM context window. Closes #29 https://claude.ai/code/session_018yDmuebxUwxQNmxFUgJ3Gz
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 0 minutes and 18 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (13)
WalkthroughAdds a manual Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant oauth_setup as oauth_setup.py
participant CallbackSvr as Callback Server
participant Browser as Browser
participant OAuthProvider as OAuth Provider
User->>oauth_setup: run_oauth_flow()
Note over oauth_setup: Attempt 1 of 2
oauth_setup->>oauth_setup: Generate PKCE pair
oauth_setup->>CallbackSvr: _start_callback_server()<br/>(retry up to 3x on bind)
alt Bind succeeds
CallbackSvr-->>oauth_setup: HTTPServer
else Bind fails after retries
CallbackSvr-->>oauth_setup: None (hard failure)
oauth_setup-->>User: Error & exit
end
oauth_setup->>Browser: Open auth URL
Browser->>OAuthProvider: Request authorization
OAuthProvider->>Browser: Redirect to callback
Browser->>CallbackSvr: GET /callback?code=...
CallbackSvr->>oauth_setup: _received_code set
oauth_setup->>OAuthProvider: Exchange code for token<br/>using PKCE
alt Token exchange succeeds
OAuthProvider-->>oauth_setup: Access token
oauth_setup-->>User: Success ✓
else RuntimeError (single-use/expired code)
oauth_setup-->>oauth_setup: Attempt 2 of 2
Note over oauth_setup: Generate NEW PKCE,<br/>show "new auth link"
oauth_setup->>Browser: Open NEW auth URL
Browser->>OAuthProvider: New authorization request
OAuthProvider->>Browser: New redirect
Browser->>CallbackSvr: GET /callback?code=...
CallbackSvr->>oauth_setup: _received_code updated
oauth_setup->>OAuthProvider: Exchange with NEW PKCE
OAuthProvider-->>oauth_setup: Access token
oauth_setup-->>User: Success ✓
else Other failure
oauth_setup-->>User: Error & exit
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
platforms/claude/commands/set-key.md (1)
39-41: Add a language tag to this fenced block.
markdownlintis already flagging this fence, so the doc will keep carrying a warning until the snippet is tagged astext.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@platforms/claude/commands/set-key.md` around lines 39 - 41, The fenced code block containing the command "/grok-swarm:set-key YOUR_API_KEY_HERE" is missing a language tag which triggers markdownlint; edit the fenced block in set-key.md to add the `text` tag after the opening backticks (i.e., change "```" to "```text") so the snippet is recognized as plain text and the markdownlint warning is resolved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@platforms/claude/commands/set-key.md`:
- Around line 48-60: The current snippet interpolates the raw argument into the
shell and Python source (API_KEY and the literal assignment to
existing['api_key']), which allows command substitution or quote-break attacks;
instead pass the key as data (e.g., export API_KEY into the environment or pass
it as a Python command-line argument) and read it inside Python via os.environ
or sys.argv so you do not embed the key inside Python source text; update the
block that sets API_KEY and the Python runner that writes to
config_file/existing['api_key'] to accept and use the key from environment or
argv and write it as raw data (no string interpolation into the Python code).
In `@platforms/claude/commands/setup.md`:
- Around line 58-65: Update the explanatory note to reflect that TIMEOUT_CMD
(which tries gtimeout then timeout) provides a best-effort 240s wrapper for
running oauth_setup.py, and that if neither gtimeout nor timeout is present
(e.g., stock macOS) the script falls back to the internal 180s timeout in
src/bridge/oauth_setup.py rather than enforcing 240s; reference the TIMEOUT_CMD
variable, the gtimeout/timeout fallbacks, the "240s" wrapper, and oauth_setup.py
so readers understand the actual timeout contract.
In `@src/bridge/oauth_setup.py`:
- Around line 306-315: The current retry logic treats every RuntimeError from
_exchange_code() as an expired PKCE code and retries; change this so only
genuinely retryable "invalid/expired authorization code" errors trigger a retry:
update _exchange_code to raise a distinct error (e.g.,
InvalidAuthorizationCodeError) or attach a retryable flag/message to the
RuntimeError, then in the try/except block inspect the exception (type or a
retryable attribute or the OAuth error field like
"invalid_grant"/"expired_token") and only perform the print/continue when that
indicates the code is expired/one-time-use; for any other RuntimeError (network,
5xx, JSON parse, malformed payload) re-raise or fail immediately so the user is
not forced into an unnecessary re-auth flow (refer to _exchange_code, the except
RuntimeError as exc block, and variables attempt/max_attempts).
- Around line 275-289: The code opens the browser before binding the local
callback HTTPServer, which can race with the OAuth redirect; modify the flow in
oauth_setup.py so you call _start_callback_server() first (and handle server is
None as currently done), only after a successfully bound server call
_open_browser(auth_url) and then wait for the callback (preserving the existing
timeout logic using OAUTH_TIMEOUT_SECS and the existing error/exit behavior
referencing CALLBACK_PORT); ensure you still print the "Attempting to open your
browser..." message immediately before calling _open_browser so user feedback
remains correct.
---
Nitpick comments:
In `@platforms/claude/commands/set-key.md`:
- Around line 39-41: The fenced code block containing the command
"/grok-swarm:set-key YOUR_API_KEY_HERE" is missing a language tag which triggers
markdownlint; edit the fenced block in set-key.md to add the `text` tag after
the opening backticks (i.e., change "```" to "```text") so the snippet is
recognized as plain text and the markdownlint warning is resolved.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ef69365b-2e03-46dc-96d8-de19fbd6c450
📒 Files selected for processing (3)
platforms/claude/commands/set-key.mdplatforms/claude/commands/setup.mdsrc/bridge/oauth_setup.py
Add a stdio-based MCP server (JSON-RPC 2.0) that exposes Grok 4.20 multi-agent as native tools for Claude Code. This replaces the bash subprocess model with proper tool-calling semantics. New tools: - grok_query: stateless single call (replaces CLI subprocess pattern) - grok_session_start: begin a multi-turn conversation with Grok - grok_session_continue: continue an existing session with history - grok_agent: run the autonomous agent loop Architecture: - src/mcp/grok_server.py: MCP protocol layer + tool dispatch - src/mcp/session.py: GrokSession class with in-memory session store, TTL expiration, token budget tracking, and conversation history - src/bridge/grok_bridge.py: new call_grok_with_messages() that accepts pre-built message lists (used by sessions); call_grok() refactored to delegate to it for backward compat Install: claude mcp add grok-swarm -- python3 src/mcp/grok_server.py Addresses #30 (stateful sessions) and lays foundation for #31 (Grok sub-agent tools will be added in Phase 3 follow-up). https://claude.ai/code/session_018yDmuebxUwxQNmxFUgJ3Gz
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (3)
src/mcp/grok_server.py (2)
375-381: Lazy import has unused bindings.The
_get_session_module()function imports_sessions,create_session, andget_sessionbut never uses them—only the module object is retained. This works but the explicit imports are misleading.🧹 Simplify the import
def _get_session_module(): global _session_module if _session_module is None: - from mcp.session import _sessions, create_session, get_session import mcp.session as mod _session_module = mod return _session_module🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mcp/grok_server.py` around lines 375 - 381, The function _get_session_module currently does a lazy import but pulls in unused bindings (_sessions, create_session, get_session) which is misleading; update the function to only import the module object (e.g., import mcp.session as mod or use importlib.import_module('mcp.session')) and assign it to the global _session_module, leaving the rest of the logic unchanged so callers still use _session_module.
441-450: Consider a more robust error detection pattern.The current approach detects errors by checking if the response text starts with
"ERROR:". This works, but it's a bit like checking if someone's angry by looking for exclamation marks—it could lead to false positives if legitimate content happens to start with "ERROR:".A cleaner approach would be for
_error_content()to return a distinct marker or for handlers to return a tuple indicating success/failure.💡 Optional: Return structured result from handlers
-def _error_content(message): - """Wrap error in MCP tool result content format (isError=True).""" - return [{"type": "text", "text": f"ERROR: {message}"}] +def _error_content(message): + """Wrap error in MCP tool result content format (isError=True).""" + return ([{"type": "text", "text": message}], True) # (content, is_error) + +def _text_content(text): + """Wrap text in MCP tool result content format.""" + return ([{"type": "text", "text": text}], False) # (content, is_error)Then update handlers to return the tuple and
_handle_tools_callto unpack it.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mcp/grok_server.py` around lines 441 - 450, The current error detection in _handle_tools_call relies on checking if handler(tool_args) returns content whose text starts with "ERROR:", which is fragile; change handlers to return a structured result (e.g., (success_bool, content) or a dict with an explicit error flag) or make _error_content() return a distinct sentinel object, then update _handle_tools_call to unpack and inspect that explicit flag/sentinel instead of string-matching; locate calls to handler(tool_args), the _error_content() helper, and the result construction (result = {"content": content}) and modify them so handlers return (ok, content) or {"isError": True, "content": ...} and _handle_tools_call uses that explicit field to set result["isError"].src/mcp/session.py (1)
108-116: Budget warning placement may be suboptimal.The budget warning is appended as a system message after the user messages. Some models weight earlier system messages more heavily than later ones. Consider placing this warning immediately after the initial system prompt for more reliable attention.
💡 Alternative placement
# Build full message list for API api_messages = [{"role": "system", "content": self.system_prompt}] + + # Inject budget warning if nearly exhausted (early for visibility) + if self.total_tokens > self.max_tokens * 0.8: + api_messages.append({ + "role": "system", + "content": ( + "NOTE: Token budget for this session is nearly exhausted. " + "Please wrap up your response concisely." + ), + }) + api_messages.extend(self.messages) - - # Inject budget warning if nearly exhausted - if self.total_tokens > self.max_tokens * 0.8: - api_messages.append({ - "role": "system", - "content": ( - "NOTE: Token budget for this session is nearly exhausted. " - "Please wrap up your response concisely." - ), - })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mcp/session.py` around lines 108 - 116, The budget warning is currently appended at the end of api_messages (after user messages) which can reduce its influence; change the logic in session.py so when self.total_tokens > self.max_tokens * 0.8 you insert the warning immediately after the initial system prompt instead of appending it—e.g., locate where api_messages is built (the code that adds the initial system prompt and user entries) and insert the warning at index 1 or right after the first system message entry (use api_messages.insert(1, {...}) or find the first message with role == "system" and insert after it) so the warning appears near the top of the conversation. Ensure you continue to use the same message shape ("role": "system", "content": ...).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@package.json`:
- Around line 31-32: The package.json currently exposes "grok-swarm-mcp"
pointing directly at a Python file and uses ./scripts/setup.sh in postinstall,
which breaks on Windows; replace the direct Python bin with a Node wrapper
(e.g., add bin/grok-swarm-mcp.js) that spawns the Python interpreter (try
"python3" then "python") to run src/mcp/grok_server.py, update the
"grok-swarm-mcp" bin entry to point to that JS wrapper, and modify the
postinstall script to detect process.platform (win32 vs others) and run the
appropriate script/command (invoke scripts/setup.sh on Unix, a .cmd or
PowerShell script on Windows) or add a .cmd shim as an interim fallback so
Windows users can run the CLI and install steps without manual intervention.
In `@src/bridge/grok_bridge.py`:
- Around line 383-387: The current try/except around the usage tracking silently
swallows all exceptions; change it to catch Exception as e and log the failure
so issues are visible (e.g., using logging.exception or printing to sys.stderr)
when importing usage_tracker or calling record_usage(mode, thinking,
u.prompt_tokens, u.completion_tokens, u.total_tokens, elapsed) fails; keep it
best-effort (do not re-raise) but ensure the exception and context (mode,
thinking, token counts, elapsed) are included in the log message for debugging.
In `@src/mcp/session.py`:
- Around line 125-128: The current code in Session is accumulating
response.usage.total_tokens into self.total_tokens which double-counts prompt
history across turns if your intent is to measure unique conversation size;
decide which metric you want: if you want API cost keep the existing addition of
response.usage.total_tokens (no change), otherwise change the accounting to only
add response.usage.completion_tokens (or compute delta by storing the last
prompt token count e.g., self.last_prompt_tokens and add
response.usage.total_tokens - self.last_prompt_tokens) and update any references
to self.total_tokens accordingly; look for the symbols
response.usage.total_tokens, response.usage.completion_tokens, self.total_tokens
(and consider adding self.last_prompt_tokens) inside the Session class in
src/mcp/session.py to implement the chosen fix.
---
Nitpick comments:
In `@src/mcp/grok_server.py`:
- Around line 375-381: The function _get_session_module currently does a lazy
import but pulls in unused bindings (_sessions, create_session, get_session)
which is misleading; update the function to only import the module object (e.g.,
import mcp.session as mod or use importlib.import_module('mcp.session')) and
assign it to the global _session_module, leaving the rest of the logic unchanged
so callers still use _session_module.
- Around line 441-450: The current error detection in _handle_tools_call relies
on checking if handler(tool_args) returns content whose text starts with
"ERROR:", which is fragile; change handlers to return a structured result (e.g.,
(success_bool, content) or a dict with an explicit error flag) or make
_error_content() return a distinct sentinel object, then update
_handle_tools_call to unpack and inspect that explicit flag/sentinel instead of
string-matching; locate calls to handler(tool_args), the _error_content()
helper, and the result construction (result = {"content": content}) and modify
them so handlers return (ok, content) or {"isError": True, "content": ...} and
_handle_tools_call uses that explicit field to set result["isError"].
In `@src/mcp/session.py`:
- Around line 108-116: The budget warning is currently appended at the end of
api_messages (after user messages) which can reduce its influence; change the
logic in session.py so when self.total_tokens > self.max_tokens * 0.8 you insert
the warning immediately after the initial system prompt instead of appending
it—e.g., locate where api_messages is built (the code that adds the initial
system prompt and user entries) and insert the warning at index 1 or right after
the first system message entry (use api_messages.insert(1, {...}) or find the
first message with role == "system" and insert after it) so the warning appears
near the top of the conversation. Ensure you continue to use the same message
shape ("role": "system", "content": ...).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f2ce8c58-899e-45a7-938b-1cc463a20390
📒 Files selected for processing (6)
package.jsonsrc/bridge/grok_bridge.pysrc/mcp/__init__.pysrc/mcp/__main__.pysrc/mcp/grok_server.pysrc/mcp/session.py
✅ Files skipped from review due to trivial changes (1)
- src/mcp/init.py
- set-key.md: pass API key via GROK_SET_KEY env var instead of interpolating into shell/Python source (prevents injection) - setup.md: update timeout note to reflect best-effort behavior when neither gtimeout nor timeout is available - oauth_setup.py: bind callback server BEFORE opening browser to prevent race condition on fast redirects; add _InvalidCodeError subclass so only HTTP 400 (expired PKCE code) triggers retry — network errors and 5xx fail immediately; add urllib.error import - grok_bridge.py: log usage tracking failures to stderr instead of silently swallowing exceptions - grok_server.py: use tuple returns (content, is_error) from helpers instead of string-matching "ERROR:" prefix; remove unused lazy import bindings - session.py: move budget warning before conversation messages for better model attention; add comment clarifying token accounting tracks API billing cost (intentionally includes re-sent history) https://claude.ai/code/session_018yDmuebxUwxQNmxFUgJ3Gz
- SKILL_AGENT.md: replace old subprocess architecture diagram with MCP server flow; update invocation pattern to show native MCP tool calls as preferred method; replace "STATELESS" limitation with multi-turn session documentation; remove "NO Native Agent Pool" limitation - SKILL.md: add MCP Tools section documenting grok_query, grok_session_start, grok_session_continue, grok_agent - README.md: add "Native MCP Server" and "Multi-Turn Sessions" to features; replace single architecture diagram with separate Claude Code (MCP) and OpenClaw (Plugin) diagrams - setup.sh: auto-register MCP server via `claude mcp add` during plugin installation so users get native tools without manual config https://claude.ai/code/session_018yDmuebxUwxQNmxFUgJ3Gz
- Create bin/grok-swarm-mcp.js Node wrapper that spawns python3 (or python on Windows) to run grok_server.py, forwarding stdio for the MCP transport - Update package.json bin entry to point to the JS wrapper instead of the Python file directly (shebangs don't work on Windows) - Add bin/ to package.json files array for npm packaging - Update postinstall.js to show platform-appropriate setup commands on Windows vs Unix Fixes #33 https://claude.ai/code/session_018yDmuebxUwxQNmxFUgJ3Gz
Summary
Enhanced the OAuth setup flow to handle transient failures gracefully by retrying with a fresh PKCE pair if the authorization code exchange fails. Also improved error messages and added a manual fallback command for setting API keys.
Key Changes
_start_callback_server()function to reduce code duplication and improve testabilityrun_oauth_flow()that attempts the OAuth flow up to 2 times, generating a new PKCE pair on each attempt. This handles the case where an authorization code expires or is already consumed (PKCE codes are single-use)gtimeouton macOS (from Homebrew coreutils) or fall back totimeouton Linux, with graceful degradation if neither is availableset-keycommand (platforms/claude/commands/set-key.md) as a last-resort manual fallback for setting API keys when OAuth fails, with prominent security warnings about key visibility in conversation contextImplementation Details
_received_codeat the start of each attempt to ensure clean statetimemodule is now imported at the function level for cleaner code organizationset-keycommand uses secure file operations (atomic write with 0o600 permissions) to save the API keyhttps://claude.ai/code/session_018yDmuebxUwxQNmxFUgJ3Gz
Summary by CodeRabbit
New Features
Improvements