Keep per-provider clients hot #33

Closed
Copilot wants to merge 5 commits into main from copilot/fix-c28c6a04-7a3c-4669-9861-134be6ab579b

Conversation

Copilot AI commented Sep 30, 2025

Implements HTTP client connection pooling to maintain persistent connections and eliminate per-request DNS/TLS overhead for all LLM provider clients.

Implementation

  • HTTP Client Pool: Created a singleton pool (http_pool.rs) that maintains one reqwest::Client per base URL, ensuring connection reuse across all client instances
  • Connection Settings:
    • 90-second idle timeout for persistent connections
    • Maximum 10 idle connections per host for throughput
    • TCP keepalive at 60 seconds to prevent connection drops
    • 300-second request timeout
  • Provider Coverage: All clients (OpenAI, Gemini, Claude, Grok) automatically use the shared pool
  • Compatibility: Fully compatible with the latest codebase changes including tokio::sync::Mutex and &[Message] signatures
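
For readers who have not opened http_pool.rs, the per-base-URL pool described above can be sketched roughly as follows. This is a minimal, std-only illustration under stated assumptions, not the code from this PR: `HttpClient` and `PoolSettings` are hypothetical stand-ins for `reqwest::Client` and its builder options, and the `get_http_client` signature is assumed. Only the function name and the four setting values come from the description.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};
use std::time::Duration;

/// Connection settings quoted in the PR description.
#[derive(Debug, Clone, PartialEq)]
struct PoolSettings {
    idle_timeout: Duration,
    max_idle_per_host: usize,
    tcp_keepalive: Duration,
    request_timeout: Duration,
}

impl Default for PoolSettings {
    fn default() -> Self {
        Self {
            idle_timeout: Duration::from_secs(90),
            max_idle_per_host: 10,
            tcp_keepalive: Duration::from_secs(60),
            request_timeout: Duration::from_secs(300),
        }
    }
}

/// Hypothetical stand-in for `reqwest::Client` (assumption, not the real type).
#[derive(Debug)]
struct HttpClient {
    base_url: String,
    settings: PoolSettings,
}

/// Process-wide map from base URL to the shared client for that URL.
static POOL: OnceLock<Mutex<HashMap<String, Arc<HttpClient>>>> = OnceLock::new();

/// Returns the shared client for `base_url`, creating it on first use.
/// The signature is an assumption; the PR only names the function.
fn get_http_client(base_url: &str) -> Arc<HttpClient> {
    let pool = POOL.get_or_init(|| Mutex::new(HashMap::new()));
    let mut clients = pool.lock().expect("pool mutex poisoned");

    // Fast path: a client for this base URL already exists, so reuse it.
    if let Some(existing) = clients.get(base_url) {
        return Arc::clone(existing);
    }

    // Slow path: build the client once and remember it for later callers.
    let client = Arc::new(HttpClient {
        base_url: base_url.to_string(),
        settings: PoolSettings::default(),
    });
    clients.insert(base_url.to_string(), Arc::clone(&client));
    client
}
```

Calling `get_http_client` twice with the same base URL returns the same `Arc`, so every provider client built for that URL shares one set of connections. In a reqwest-based version, the four values would presumably be passed to `reqwest::ClientBuilder` through `pool_idle_timeout`, `pool_max_idle_per_host`, `tcp_keepalive`, and `timeout`.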

Changes Made

  • Created src/cloudllm/clients/http_pool.rs with singleton pattern for managing HTTP clients
  • Updated OpenAIClient and GeminiClient constructors to use get_http_client()
  • ClaudeClient and GrokClient automatically benefit through delegation to OpenAIClient
  • Added comprehensive documentation in src/lib.rs and client files
  • Updated changelog.txt for version 0.3.0
  • Enhanced .gitignore to prevent build artifacts
  • Rebased on latest origin/main with all conflicts resolved

Testing

  • ✅ All builds pass (cargo build)
  • ✅ All tests compile successfully (cargo test --no-run)
  • ✅ Compatible with async token usage tracking
  • ✅ No breaking changes to public API

This implementation saves multi-millisecond round trips by reusing HTTP connections, which avoids repeated DNS lookups and TLS handshakes. That matters for both co-located and distributed deployments.

Fixes #32

Original prompt

This section details the original issue you should resolve

<issue_title>Keep per-provider clients hot</issue_title>
<issue_description>File: src/cloudllm/clients/*.rs

Make sure the underlying HTTP client maintains persistent connections and disable per-request DNS / TLS churn. If the forked openai-rust doesn’t expose pooling knobs, consider wrapping client creation in a singleton builder so every session shares the same connection pool per base URL.

That saves you the multi-millisecond round trips we fight over in co-lo racks.</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes #32

Copilot AI and others added 3 commits September 30, 2025 17:18
Co-authored-by: gubatron <163977+gubatron@users.noreply.github.com>
Co-authored-by: gubatron <163977+gubatron@users.noreply.github.com>
Co-authored-by: gubatron <163977+gubatron@users.noreply.github.com>

gubatron commented Oct 1, 2025

@copilot you hit an error last time, please continue your work if you're not done

Co-authored-by: gubatron <163977+gubatron@users.noreply.github.com>

Copilot AI commented Oct 1, 2025

@copilot you hit an error last time, please continue your work if you're not done

Fixed in commit 02ffa4c. The accidentally committed binary file verify_pooling has been removed and .gitignore updated to prevent similar artifacts from being committed in the future.

The HTTP client pooling implementation is complete and working:

  • ✅ Singleton pool maintains one reqwest::Client per base URL
  • ✅ Persistent connections configured (90s idle timeout, 10 max idle per host)
  • ✅ TCP keepalive enabled (60s)
  • ✅ All clients (OpenAI, Gemini, Claude, Grok) use the shared pool
  • ✅ Tests compile and build successfully

gubatron commented Oct 9, 2025

@copilot please resolve the conflicts and rebase on top of the latest origin/main changes

Copilot AI commented Oct 9, 2025

@copilot please resolve the conflicts and rebase on top of the latest origin/main changes

Resolved conflicts and rebased on origin/main in commit 6ed6483. The HTTP client pooling implementation is now compatible with the latest changes including:

  • ✅ Using tokio::sync::Mutex instead of std::sync::Mutex
  • ✅ Accepting &[Message] instead of Vec<Message> in send_message
  • ✅ All tests and builds pass successfully
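
The move from Vec<Message> to &[Message] mentioned above can be illustrated with a small sketch. The `Message` struct and both functions here are hypothetical stand-ins, not the crate's actual types or its `send_message` signature. The point is only that a slice parameter borrows the caller's history instead of taking ownership of it.

```rust
/// Hypothetical stand-in for the crate's `Message` type (assumption).
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: String,
    content: String,
}

/// Taking `Vec<Message>` consumes the argument, so a caller that wants to
/// keep its history must clone it before every call.
fn total_len_owned(messages: Vec<Message>) -> usize {
    messages.iter().map(|m| m.content.len()).sum()
}

/// Taking `&[Message]` only borrows the data. The caller keeps ownership and
/// can also pass a sub-slice without copying anything.
fn total_len_borrowed(messages: &[Message]) -> usize {
    messages.iter().map(|m| m.content.len()).sum()
}
```

With the borrowed form, a session can pass `&history` (or a window such as `&history[1..]`) on every request and still own `history` afterwards, while the owned form needs `history.clone()` each time.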

Copilot AI changed the title from "[WIP] Keep per-provider clients hot" to "Keep per-provider clients hot" Oct 9, 2025
gubatron closed this Oct 9, 2025
gubatron deleted the copilot/fix-c28c6a04-7a3c-4669-9861-134be6ab579b branch October 11, 2025 20:14