
feat(proxy): support production multi-worker proxy startup + CLI/env tuning#279

Open
Kayzo wants to merge 2 commits into chopratejas:main from Kayzo:feature/production-multi-worker-proxy-startup

Conversation

Contributor

@Kayzo Kayzo commented Apr 26, 2026

Summary

Adds first-class support for running the Headroom proxy with multiple Uvicorn workers through the official headroom proxy CLI and the Docker image, so Headroom can be used as a shared local proxy for multi-session / multi-subagent
coding-agent setups without needing a custom launcher.

This was requested in the issue that proposed CLI flags + env vars for workers and connection pool sizing on headroom proxy, and documented the workaround of running a separate ASGI launcher under python -m uvicorn.

Motivation

Running Headroom as a shared proxy in front of multiple agent sessions (Codex/OpenAI, Anthropic, Gemini) is CPU-bound during compression/tokenization. With a single worker, the proxy can bottleneck even when the host has many cores and RAM
available.

Before this PR:

  • headroom proxy offered no way to scale workers, concurrency, or connection pool.
  • Setting HEADROOM_WORKERS as an env var had no effect because the CLI did not read it.
  • Passing --workers N directly to python -m headroom.proxy.server failed because Uvicorn requires an import string to enable multiple workers, not an instantiated app.
  • Users had to write a custom ASGI module + override the Docker entrypoint to scale at all.

Changes

headroom/cli/proxy.py

  • New CLI flags + env vars on the headroom proxy command:
    • --workers / HEADROOM_WORKERS (default 1)
    • --limit-concurrency / HEADROOM_LIMIT_CONCURRENCY (default 1000)
    • --max-connections / HEADROOM_MAX_CONNECTIONS (default 500)
    • --max-keepalive / HEADROOM_MAX_KEEPALIVE (default 100)
  • CLI flags override env vars via Click precedence.
  • workers and limit_concurrency are forwarded to run_server(...) only when non-default, so default single-worker behavior is byte-identical to before.
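The CLI > env > default precedence can be sketched as follows. This is a minimal stdlib illustration under stated assumptions: `resolve_setting` is a hypothetical helper, not Headroom's actual code; the real CLI gets the same behavior from Click's `envvar=` option support.

```python
import os
from typing import Optional

# Sketch of the precedence the CLI implements: an explicit flag value
# wins over the env var, which wins over the built-in default.
# (`resolve_setting` is illustrative only; Click handles this natively.)
def resolve_setting(cli_value: Optional[int], env_name: str, default: int) -> int:
    if cli_value is not None:         # explicit --flag beats everything
        return cli_value
    env_value = os.environ.get(env_name)
    if env_value is not None:         # env var beats the default
        return int(env_value)
    return default
```

For example, with `HEADROOM_WORKERS=4` exported, `resolve_setting(None, "HEADROOM_WORKERS", 1)` yields 4, while an explicit `--workers 8` (a non-None `cli_value`) yields 8.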

headroom/proxy/server.py

  • When workers > 1, run_server(...) serializes the full ProxyConfig into an internal env var HEADROOM_PROXY_CONFIG_JSON and starts Uvicorn with:
    • app_target = "headroom.proxy.server:create_app_from_env"
    • factory=True
    • workers=N, limit_concurrency=M, plus the existing proxy_headers=False.
  • When workers == 1, the old behavior is preserved: create_app(config) is called directly and passed as the app object.
  • New helpers:
    • _json_ready(value) — recursive dataclass → JSON-safe conversion.
    • _proxy_config_payload(config) — builds a serializable dict for all ProxyConfig fields. Verified to preserve all 92 fields.
    • proxy_config_from_env() — reconstructs ProxyConfig from HEADROOM_PROXY_CONFIG_JSON; falls back to key HEADROOM* env vars for direct Uvicorn entrypoint usage.
    • create_app_from_env() — public ASGI factory referenced by the import string.

This means in multi-worker mode, each Uvicorn worker process re-creates the exact same Headroom app with the exact same ProxyConfig used by the parent CLI process.
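The parent-to-worker config handoff can be sketched like this. It is a simplified stand-in: the real ProxyConfig has 92 fields and nested dataclasses (hence the `_json_ready` helper), while this two-field version only demonstrates the env-var round trip.

```python
import dataclasses
import json
import os

@dataclasses.dataclass
class ProxyConfig:  # stand-in for Headroom's real 92-field config
    host: str = "127.0.0.1"
    port: int = 8787

def publish_config(config: ProxyConfig) -> None:
    # parent CLI process: runs just before uvicorn.run(..., factory=True)
    os.environ["HEADROOM_PROXY_CONFIG_JSON"] = json.dumps(
        dataclasses.asdict(config)
    )

def proxy_config_from_env() -> ProxyConfig:
    # each worker process: rebuild the exact config the parent serialized
    payload = json.loads(os.environ["HEADROOM_PROXY_CONFIG_JSON"])
    return ProxyConfig(**payload)
```

With the real code, `uvicorn.run("headroom.proxy.server:create_app_from_env", factory=True, workers=N, ...)` imports the factory fresh in every worker process, and the factory reads the env var before building the app, so all workers start from an identical config.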

headroom/proxy/loopback_guard.py

  • Bugfix: is_loopback_host("::ffff:127.0.0.1") now returns True on Linux dual-stack sockets as its docstring and existing tests promised. Uses IPv6Address.ipv4_mapped.is_loopback explicitly instead of relying on IPv6Address.is_loopback,
    which returns False for mapped literals.
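The fix can be sketched with the stdlib ipaddress module. This is a minimal reimplementation for illustration; the function name mirrors the PR, but the body is an assumption about how the guard works.

```python
import ipaddress

def is_loopback_host(host: str) -> bool:
    """True for loopback literals, including IPv4-mapped ones like ::ffff:127.0.0.1."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # not an IP literal; treat the conventional hostname as loopback
        return host.lower() == "localhost"
    if isinstance(addr, ipaddress.IPv6Address):
        mapped = addr.ipv4_mapped   # IPv4Address for ::ffff:a.b.c.d, else None
        if mapped is not None:
            return mapped.is_loopback  # check the embedded IPv4 address
    return addr.is_loopback
```

The explicit `ipv4_mapped` check matters because `IPv6Address("::ffff:127.0.0.1").is_loopback` only tests against `::1` on the affected Python versions, so mapped loopback literals report False without it.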

Tests

  • tests/test_proxy_scalability.py::test_run_server_uses_import_string_for_multiple_workers:
    • Verifies that with workers > 1, uvicorn.run receives the import string, factory=True, correct workers/limit_concurrency, and that HEADROOM_PROXY_CONFIG_JSON contains the expected config.
  • tests/test_cli_proxy_env.py::test_production_scaling_env_vars and
    tests/test_cli_proxy_env.py::test_production_scaling_cli_flags_override_env_vars:
    • Verify env → CLI → ProxyConfig + run_server(...) kwargs wiring and precedence.
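The shape of the scalability test can be sketched with a mock standing in for `uvicorn.run`. This is fully self-contained and simplified: `run_server` below takes its serve callable as a parameter so the sketch needs no real server, unlike Headroom's actual `run_server(...)`.

```python
from unittest import mock

def run_server(serve, workers: int = 1, limit_concurrency: int = 1000) -> None:
    # simplified dispatch: import string + factory for multi-worker,
    # direct app object for the single-worker path
    if workers > 1:
        serve("headroom.proxy.server:create_app_from_env",
              factory=True, workers=workers,
              limit_concurrency=limit_concurrency)
    else:
        serve(object())

# the test patches the serve callable and asserts on what it received
fake_run = mock.Mock()
run_server(fake_run, workers=4, limit_concurrency=250)
fake_run.assert_called_once_with(
    "headroom.proxy.server:create_app_from_env",
    factory=True, workers=4, limit_concurrency=250,
)
```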

Backwards compatibility

  • Single-worker path is unchanged; same create_app(config) + direct Uvicorn call.
  • run_server(...) signature preserved (workers=1, limit_concurrency=1000).
  • No route or response-shape changes.
  • No changes to request handlers, compression pipeline, memory, or backends.

Docker / production usage

  services:
    headroom-proxy:
      image: ghcr.io/chopratejas/headroom:latest
      environment:
        HEADROOM_WORKERS: "4"
        HEADROOM_MAX_CONNECTIONS: "200"
        HEADROOM_MAX_KEEPALIVE: "50"
        HEADROOM_LIMIT_CONCURRENCY: "250"
      ports:
        - "8787:8787"

Verified locally: built the image from this branch with the runtime stage; container boots workers=2, shows two real child processes, and serves /health from both workers. Banner reports the configured values:

  Workers: 2    Concurrency Limit: 25
  Conn Pool: max=33, keepalive=7

Caveats

Known behavior in multi-worker mode, not a regression from this PR but worth calling out for users:

  • In-memory state (cache, rate limiter buckets, WebSocket registry, display session, recent requests) is per worker.
  • Dashboard and stats views may appear to change as requests land on different workers.
  • Worker-aware session management and cross-worker stats aggregation are planned as a separate follow-up issue/branch — intentionally out of scope here to keep this change focused on startup.

Validation

  • Config round-trip: all 92 ProxyConfig fields survive JSON round-trip.
  • Full proxy/CLI/health/debug/WS/backpressure targeted test sets pass:
    • test_cli_proxy_env.py, test_proxy_scalability.py, test_proxy_healthchecks.py, test_proxy_debug_endpoints.py, test_ws_session_registry.py, test_anthropic_pre_upstream_backpressure.py.
  • Broader proxy + Codex/WS suite: 273 passed, 103 skipped (skipped are environment-dependent / external).
  • Runtime startup smoke: 2-worker host startup, two real workers serving /health.
  • Docker startup smoke: 2-worker container from this branch, two real workers serving /health, env vars honored by headroom proxy.


codecov Bot commented Apr 26, 2026

Codecov Report

❌ Patch coverage is 68.00000% with 16 lines in your changes missing coverage. Please review.

  Files with missing lines    Patch %   Lines
  headroom/proxy/server.py    55.55%    13 Missing and 3 partials ⚠️


@chopratejas
Owner

Fantastic work, @Kayzo, thank you!

There seems to be a linting error; can you help fix that?

