
perf(core): implement HTTP connection pooling for SPARQL endpoint #36

Merged

wiresio merged 4 commits into eclipse-thingweb:main from kaiprodevops:perf/sparql-connection-pooling on Mar 30, 2026


Conversation

Contributor

kaiprodevops commented Mar 27, 2026

Description

This PR addresses a critical performance bottleneck in the core SPARQL communication layer (tdd/sparql.py) by replacing the localized HTTP context managers with a globally pooled HTTP client.

Changes Included:

  • Replaced the with httpx.Client() as client: blocks inside the query function with a persistent, module-level http_client instance (see the sketch after this list).
  • Configured the global client with strict resource limits, timeouts, and environmental isolation.
  • Updated the inline TODO regarding SPARQL injection to clarify that sanitization is the architectural responsibility of upstream input validators, not the HTTP transport layer.
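
For illustration, a minimal sketch of the shape of the change in tdd/sparql.py (the endpoint URL, the response handling, and the exact function signature here are assumptions; the real module has its own configuration):

import httpx

# Assumed endpoint for illustration; the real module reads this from configuration.
SPARQL_ENDPOINT = "http://sparql-endpoint:3030/things/query"

# Module-level pooled client: sockets are kept alive and reused across
# queries instead of being re-established for every request.
http_client = httpx.Client(
    limits=httpx.Limits(max_keepalive_connections=50, max_connections=100),
    timeout=httpx.Timeout(10.0, connect=5.0),
    trust_env=False,         # ignore HTTP_PROXY, HTTPS_PROXY, etc.
    follow_redirects=False,  # talk only to the configured endpoint
)

def query(sparql_query: str) -> httpx.Response:
    # Previously each call opened `with httpx.Client() as client:`, paying a
    # fresh TCP/TLS handshake per query; the pooled client reuses connections.
    return http_client.post(
        SPARQL_ENDPOINT,
        data={"query": sparql_query},
        headers={"Accept": "application/sparql-results+json"},
    )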

Motivation & Impact

While this refactoring is a prerequisite for the upcoming Model Context Protocol (MCP) integration, in which AI agents will run reasoning loops that issue many rapid, sequential queries against the graph database, it also delivers substantial immediate benefits to the existing REST API.

Previously, every single CRUD operation via the REST API forced the application to establish a new TCP 3-way handshake and TLS negotiation. By leveraging HTTP Keep-Alive through connection pooling, we eliminate this overhead entirely for subsequent queries, drastically reducing the Time-to-First-Byte (TTFB) and overall system latency for all current users.

Security & Stability Considerations

To ensure that introducing a global, long-lived HTTP client does not expand our attack surface or introduce instability, the http_client has been strictly hardened:

  1. Resource Exhaustion Prevention: Configured httpx.Limits(max_keepalive_connections=50, max_connections=100) to ensure high-concurrency spikes do not drain server file descriptors.
  2. Anti-Hanging Mechanism: Added explicit Timeout(10.0, connect=5.0) to prevent stale connections from blocking application worker threads.
  3. Environment Isolation: Set trust_env=False to prevent potential Environment Variable Pollution (e.g., malicious HTTP_PROXY injection) from hijacking internal database traffic.
  4. SSRF Mitigation: Set follow_redirects=False to ensure the client strictly communicates with the known SPARQL endpoint and cannot be tricked into pivoting to other internal services.
  5. State Isolation: This client is strictly used for backend-to-backend database communication. No user-specific session state or cookies are shared across requests.

Testing Performed

  • Automated Tests: Ran the existing test suite via Docker (pytest tdd/tests/); all 41 tests passed.
  • Manual/E2E Testing: Spun up the full docker-compose environment and verified that high-frequency data seeding (via scripts/import_all_plugfest.py) and basic REST API CRUD operations run without socket hangs or connection drops.

Replaced the localized 'with httpx.Client() as client:' blocks with a global module-level HTTP client. This eliminates the overhead of establishing a new TCP handshake and TLS negotiation for every database query, preparing the API for high-frequency MCP Agent workloads.
Member

wiresio commented Mar 27, 2026

Thanks @kaiprodevops for addressing this issue. Connection pooling is a good idea; however, Claude gave me some advice:

The new code creates a bare global client that is never explicitly closed.
This means:

  • Open sockets and connections are never properly released
  • On application shutdown, this can cause resource leaks or warnings
  • It relies entirely on garbage collection, which is not guaranteed

Fix: Register a shutdown hook or use a lifecycle manager:

import atexit
import httpx

http_client = httpx.Client(
    limits=httpx.Limits(max_keepalive_connections=50, max_connections=100),
    timeout=httpx.Timeout(10.0, connect=5.0),
    trust_env=False,
    follow_redirects=False,
)

atexit.register(http_client.close)  # ← ensures cleanup on exit

trust_env=False
This disables reading proxy settings, SSL config, and environment variables like HTTP_PROXY, HTTPS_PROXY, SSL_CERT_FILE, etc.
Should be documented.

follow_redirects=False
This is actually httpx's default, so setting it explicitly is harmless. However, if the SPARQL endpoint ever issues a redirect (e.g., HTTP → HTTPS, or load balancer redirect), requests will silently fail with a 3xx response instead of following through.
Worth keeping, but should be documented as intentional.

What do you think?

Contributor Author

Thank you @wiresio for the excellent review and constructive feedback. I completely agree with the points raised and have updated the PR accordingly.

  1. Lifecycle Management (atexit): Excellent catch. Relying purely on garbage collection for unclosed sockets is an anti-pattern that can lead to resource leaks during worker restarts. I have implemented atexit.register(http_client.close) to ensure explicit cleanup on application shutdown.
  2. trust_env & follow_redirects: I have documented the rationale behind these settings directly in the code comments (see the sketch after this list). Disabling them is an intentional security design to prevent Environment Variable Pollution (proxy hijacking) and SSRF vectors.
    • Regarding the 3xx redirect edge case: I've added a note in the documentation emphasizing that the TDD API and SPARQL endpoint should strictly communicate via internal networking, bypassing external load balancers. If infrastructure changes force a redirect in the future, failing safely with a 3xx status is indeed the preferred and more secure behavior.
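
For reference, a sketch of what the pushed changes could look like with the rationale captured inline (comment wording is illustrative, not the exact committed code):

import atexit

import httpx

# trust_env=False: deliberately ignore HTTP_PROXY, HTTPS_PROXY, SSL_CERT_FILE,
# and similar environment variables so a polluted environment cannot redirect
# internal database traffic (proxy hijacking).
# follow_redirects=False: httpx's default, stated explicitly as a design
# decision. The client must talk only to the known SPARQL endpoint; an
# unexpected 3xx is surfaced to the caller instead of being followed (SSRF
# mitigation).
http_client = httpx.Client(
    limits=httpx.Limits(max_keepalive_connections=50, max_connections=100),
    timeout=httpx.Timeout(10.0, connect=5.0),
    trust_env=False,
    follow_redirects=False,
)

# Close pooled connections explicitly on interpreter shutdown rather than
# relying on garbage collection.
atexit.register(http_client.close)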

I have just pushed the requested changes. Please let me know if everything looks good to merge!

wiresio merged commit 2288563 into eclipse-thingweb:main on Mar 30, 2026
3 checks passed
Member

wiresio commented Mar 30, 2026

Thanks @kaiprodevops!
