
fix(executor): reuse HTTP client and implement global retry #536

Open
dumko2001 wants to merge 8 commits into googleworkspace:main from dumko2001:fix/engine-perf-retry-v2

Conversation

@dumko2001
Contributor

@dumko2001 dumko2001 commented Mar 18, 2026

Description

This PR implements critical engine improvements for performance and reliability:

  1. HTTP Client Reuse: Moves reqwest::Client creation outside loops to enable TCP/TLS pooling across paginated requests (saving ~300ms/page).
  2. Global Retry Logic: Wraps all API calls in send_with_retry to handle 429 TOO_MANY_REQUESTS, transient server errors (500, 502, 503), and transient network errors (timeouts and connection resets).
  3. Async Robustness: Refactors build_http_request to pre-resolve file metadata asynchronously, avoiding blocking I/O in the async execution path.
  4. Panic Prevention: Replaces all .expect() calls in retry closures with proper Result propagation.
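
The retryable cases listed in item 2 can be sketched as a small classifier. This is only an illustration: the Outcome enum and the is_retryable function are hypothetical names, not code from this PR, and status codes such as 504 are treated as non-retryable here only because the description does not list them.

```rust
/// Hypothetical outcome of one request attempt (not the PR's actual types).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Outcome {
    /// The server answered with this HTTP status code.
    Status(u16),
    /// The request timed out before a response arrived.
    Timeout,
    /// The connection was reset while the request was in flight.
    ConnectionReset,
    /// Any other failure, assumed here to be permanent.
    OtherError,
}

/// Returns true for the cases the description names as retryable:
/// 429, 500, 502, 503, timeouts, and connection resets.
fn is_retryable(outcome: Outcome) -> bool {
    match outcome {
        Outcome::Status(429 | 500 | 502 | 503) => true,
        Outcome::Status(_) => false,
        Outcome::Timeout | Outcome::ConnectionReset => true,
        Outcome::OtherError => false,
    }
}
```

Keeping the decision in one function means every call site that goes through the retry wrapper agrees on what counts as transient.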

Checklist:

  • My code follows the AGENTS.md guidelines (no generated google-* crates).
  • I have run cargo fmt --all to format the code perfectly.
  • I have run cargo clippy -- -D warnings and resolved all warnings.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have provided a Changeset file (e.g. via pnpx changeset) to document my changes.

@dumko2001 dumko2001 requested a review from jpoehnelt as a code owner March 18, 2026 08:45
@changeset-bot

changeset-bot bot commented Mar 18, 2026

🦋 Changeset detected

Latest commit: d4168b4

The changes in this PR will be included in the next version bump.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the engine's performance and resilience by optimizing HTTP client usage and implementing robust error handling. It introduces connection pooling for paginated requests to reduce latency and integrates a global retry mechanism for rate limits and transient server errors, ensuring more reliable API interactions.

Highlights

  • HTTP Client Reuse: The reqwest::Client is now created once outside the pagination loop in executor.rs, enabling TCP/TLS connection pooling and significantly reducing overhead for paginated requests.
  • Global Retry Logic: All API calls are now wrapped in a send_with_retry function, providing automatic handling for 429 TOO_MANY_REQUESTS and transient server errors (500, 502, 503) to improve request resilience.
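
The effect of creating the client once outside the pagination loop can be illustrated with a stand-in type. MockClient below is hypothetical and is not reqwest's API; it only counts simulated connection handshakes so the two loop shapes can be compared.

```rust
/// Hypothetical stand-in for a pooled HTTP client (not reqwest's API).
/// It counts how many connection handshakes it had to perform.
struct MockClient {
    connected: bool,
    handshakes: u32,
}

impl MockClient {
    fn new() -> Self {
        MockClient { connected: false, handshakes: 0 }
    }

    /// Simulates one request: the first call on a client pays for a
    /// handshake, later calls reuse the already open connection.
    fn get_page(&mut self, _page: u32) {
        if !self.connected {
            self.handshakes += 1;
            self.connected = true;
        }
    }
}

/// A new client per page: every page pays for a fresh handshake.
fn handshakes_with_client_per_page(pages: u32) -> u32 {
    let mut total = 0;
    for page in 0..pages {
        let mut client = MockClient::new();
        client.get_page(page);
        total += client.handshakes;
    }
    total
}

/// One client created before the loop: only the first page handshakes.
fn handshakes_with_shared_client(pages: u32) -> u32 {
    let mut client = MockClient::new();
    for page in 0..pages {
        client.get_page(page);
    }
    client.handshakes
}
```

In this model five pages cost five handshakes with a client per page and one handshake with a shared client, which is the kind of saving the pooling change is aiming for.
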

Footnotes

  1. Review the Generative AI Prohibited Use Policy, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces global retry logic for HTTP requests, extending the send_with_retry function in client.rs to handle 429 (Too Many Requests) and transient 5xx status codes (500, 502, 503). It also optimizes performance by reusing the HTTP client across paginated requests in executor.rs and refactors the build_http_request function to be synchronous. A critical review comment flags that the build_http_request call inside the send_with_retry closure uses .expect(), which would panic the application if request building fails, and suggests more robust error handling.

@googleworkspace-bot googleworkspace-bot added the "area: core" label (Core CLI parsing, commands, error handling, utilities) Mar 18, 2026
@googleworkspace-bot
Collaborator

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces valuable improvements by reusing the HTTP client for better performance and adding global retry logic for increased resilience. The changes in executor.rs and the helper modules are well-implemented.

I've confirmed the critical issue in the new retry logic in src/client.rs where the final retry attempt is unreachable, which undermines the goal of increased resilience. My review includes a specific suggestion to fix this.

@googleworkspace-bot
Collaborator

/gemini review

@dumko2001
Contributor Author

Updated to handle fallible request builders in send_with_retry and removed .expect() from executor.rs as suggested. Also corrected the retry logic to ensure the final attempt is reached, maintaining the intended MAX_RETRIES + 1 total attempts.
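
A rough sketch of what a retry wrapper with a fallible request builder and MAX_RETRIES + 1 total attempts might look like. The name send_with_retry_sketch, the closure signatures, and the value of MAX_RETRIES are assumptions for illustration; the real function is async and works on reqwest types, and backoff sleeping is left out here to keep the example deterministic.

```rust
const MAX_RETRIES: u32 = 3; // assumed value; the PR's constant is not shown

/// Calls `build` and then `send` up to MAX_RETRIES + 1 times.
/// `build` is fallible: a build error is returned to the caller right away
/// instead of panicking through `.expect()`.
/// `send` returns an HTTP-like status code for the built request.
fn send_with_retry_sketch<B, S>(mut build: B, mut send: S) -> Result<u16, String>
where
    B: FnMut() -> Result<String, String>,
    S: FnMut(&str) -> u16,
{
    let mut last_status = 0;
    // The inclusive range gives one initial attempt plus MAX_RETRIES retries.
    for _attempt in 0..=MAX_RETRIES {
        let request = build()?; // propagate build failures as Err
        last_status = send(&request);
        if !matches!(last_status, 429 | 500 | 502 | 503) {
            return Ok(last_status);
        }
    }
    // Retries exhausted: hand back the last retryable status.
    Ok(last_status)
}
```

With a sender that always answers 503, this sketch makes exactly MAX_RETRIES + 1 calls before giving up, and a builder that returns Err stops the loop before anything is sent.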

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces two significant improvements: reusing the HTTP client for performance and adding a global retry mechanism for resilience. The changes are well-structured and achieve the stated goals. I've identified a critical issue with a blocking I/O call in an async context and a high-severity issue where the new retry logic doesn't handle transient network errors. My review comments provide detailed explanations and suggestions for fixing these issues, including an acknowledgment of potential TOCTOU race conditions in file path validation.

@googleworkspace-bot
Collaborator

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request enhances the HTTP request retry mechanism by extending the send_with_retry function in src/client.rs to handle transient 5xx errors (500, 502, 503) alongside existing 429 rate limit retries. It also optimizes HTTP client usage in src/executor.rs by reusing a single reqwest::Client instance across paginated requests instead of creating a new one for each iteration. The build_http_request function was refactored to be synchronous, returning a RequestBuilder directly, and its usage, along with other helper functions, was updated to integrate with the enhanced retry logic.

@googleworkspace-bot
Collaborator

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces two significant improvements to the executor: reusing the HTTP client across paginated requests for better performance, and implementing a global retry mechanism for increased resilience against transient network and server errors. The implementation is well-structured, moving I/O operations out of loops and centralizing retry logic. However, I've identified a logic issue in the new retry mechanism in src/client.rs where network errors are not handled as robustly as server errors, which could impact the intended resilience improvements.

src/client.rs Outdated

let resp = match req_result {
    Ok(r) => r,
    Err(e) if attempt < MAX_RETRIES - 1 && (e.is_timeout() || e.is_connect()) => {

high

The retry logic for network errors is inconsistent with server errors. A network error on the last loop iteration (attempt == MAX_RETRIES - 1) will fail immediately, while a server error in the same situation will lead to one final retry attempt after the loop. This reduces the resilience to transient network issues. To make the behavior consistent, the attempt < MAX_RETRIES - 1 check should be removed.

Suggested change
- Err(e) if attempt < MAX_RETRIES - 1 && (e.is_timeout() || e.is_connect()) => {
+ Err(e) if e.is_timeout() || e.is_connect() => {
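
The inconsistency described in this comment can be reproduced with a small counting model. It assumes, based only on the comment, a loop over 0..MAX_RETRIES followed by one final attempt after the loop; the Failure enum, the attempts_made function, and the MAX_RETRIES value are hypothetical.

```rust
const MAX_RETRIES: u32 = 3; // assumed value; the PR's constant is not shown

/// Hypothetical failure kinds for an attempt that never succeeds.
#[derive(Clone, Copy)]
enum Failure {
    Network,
    Server,
}

/// Counts the attempts made when every attempt fails in the same way.
/// `guarded` toggles the `attempt < MAX_RETRIES - 1` check on network errors.
fn attempts_made(failure: Failure, guarded: bool) -> u32 {
    let mut attempts = 0;
    for attempt in 0..MAX_RETRIES {
        attempts += 1;
        match failure {
            // Server errors (429/5xx) always move on to the next iteration.
            Failure::Server => continue,
            // Network errors retry only while the guard allows it.
            Failure::Network if !guarded || attempt < MAX_RETRIES - 1 => continue,
            // Guard rejected the retry: the error is returned immediately.
            Failure::Network => return attempts,
        }
    }
    // One final attempt after the loop.
    attempts + 1
}
```

Under these assumptions a persistent server error gets MAX_RETRIES + 1 attempts, a persistent network error gets only MAX_RETRIES with the guard in place, and removing the guard brings the two in line.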

@dumko2001
Contributor Author

Refactored send_with_retry to be more consistent and robust. Now both network errors (timeouts/connect) and server errors (429/5xx) receive exactly MAX_RETRIES + 1 total attempts with identical backoff logic.
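
One way to give both failure kinds identical backoff is to route them through a single delay function. The exponential schedule, the constants, and the names below are assumptions made for illustration; the PR's actual delays are not shown in this conversation.

```rust
use std::time::Duration;

const MAX_RETRIES: u32 = 3; // assumed value
const BASE_DELAY_MS: u64 = 500; // assumed base delay
const MAX_DELAY_MS: u64 = 8_000; // assumed cap

/// Hypothetical failure kinds that share one retry budget.
enum Failure {
    Network,
    Server,
}

/// Exponential backoff: BASE_DELAY_MS * 2^attempt, capped at MAX_DELAY_MS.
fn backoff_delay(attempt: u32) -> Duration {
    // checked_shl avoids overflow for very large attempt numbers.
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    Duration::from_millis(BASE_DELAY_MS.saturating_mul(factor).min(MAX_DELAY_MS))
}

/// Returns the delay to sleep before the next retry, or None once the
/// budget of MAX_RETRIES retries is spent. The failure kind is ignored
/// on purpose so network and server errors behave the same.
fn delay_before_retry(attempt: u32, _failure: &Failure) -> Option<Duration> {
    if attempt < MAX_RETRIES {
        Some(backoff_delay(attempt))
    } else {
        None
    }
}
```

Because the failure kind never influences the result, attempts 0 through MAX_RETRIES - 1 are each followed by a retry, giving MAX_RETRIES + 1 attempts in total for either kind.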

@googleworkspace-bot
Collaborator

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces significant improvements to the engine's performance and reliability. Key changes include reusing the reqwest::Client across paginated requests to leverage connection pooling, and implementing a robust global retry mechanism in send_with_retry that handles transient network errors and specific 5xx/429 server responses. The logic for file uploads has also been refactored to pre-resolve file metadata, making build_http_request synchronous and avoiding I/O within the request-building closure. The changes are well-implemented and substantially enhance the application's resilience and efficiency. I have reviewed the code and found no issues.
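
The pre-resolution idea can be sketched as two steps: resolve file metadata once, then build requests from the resolved value without touching the filesystem. The ResolvedUpload struct and both function names are hypothetical. The sketch uses blocking std::fs only to stay self-contained; in an async code path the resolve step would use an async filesystem call instead.

```rust
use std::path::{Path, PathBuf};
use std::{fs, io};

/// Hypothetical metadata resolved once, before any retry loop starts.
struct ResolvedUpload {
    path: PathBuf,
    len: u64,
}

/// The only step that touches the filesystem. Errors surface here,
/// before any request is built or sent.
fn resolve_upload(path: &Path) -> io::Result<ResolvedUpload> {
    let meta = fs::metadata(path)?;
    Ok(ResolvedUpload {
        path: path.to_path_buf(),
        len: meta.len(),
    })
}

/// Pure, synchronous request building: it reads only the resolved
/// metadata, so calling it again for a retry performs no I/O.
fn build_request_summary(upload: &ResolvedUpload) -> String {
    format!("upload {} ({} bytes)", upload.path.display(), upload.len)
}
```

Since the builder is a pure function of the resolved value, a retry closure can call it on every attempt without repeating the metadata lookup.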

Labels

area: core (Core CLI parsing, commands, error handling, utilities)
area: http
