feat: af_unix daemon transport + async hyper-util/axum migration #8
Merged
void-box now defaults its daemon listener to AF_UNIX with TCP as opt-in. Mirror that contract on the client: pick the transport at construction from the daemon URL scheme, vendor the path-discovery chain so same-uid invocations need no configuration, and fail closed at construction when TCP is configured without a bearer token. AF_UNIX requests deliberately omit the Authorization header to avoid leaking credentials over a transport that doesn't need them.
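The construction-time contract described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's code: `Transport`, `select_transport`, and `authorization_header` are hypothetical names, and passing the token as a parameter is an assumption (the PR resolves it from a search chain).

```rust
/// Transport chosen once, at construction, from the daemon URL.
#[derive(Debug, PartialEq)]
pub enum Transport {
    /// AF_UNIX socket at an absolute path; never carries an Authorization header.
    Unix { socket_path: String },
    /// TCP endpoint; only constructible with a bearer token.
    Tcp { base_url: String, token: String },
}

/// Hypothetical classifier: `unix://` selects AF_UNIX, everything else is TCP.
pub fn select_transport(daemon_url: &str, token: Option<&str>) -> Result<Transport, String> {
    if let Some(path) = daemon_url.strip_prefix("unix://") {
        // Relative socket paths are rejected rather than guessed at.
        if !path.starts_with('/') {
            return Err(format!("unix:// URL needs an absolute path, got {path:?}"));
        }
        return Ok(Transport::Unix { socket_path: path.to_string() });
    }
    // A bare host:port is normalized to http://.
    let base_url = if daemon_url.contains("://") {
        daemon_url.to_string()
    } else {
        format!("http://{daemon_url}")
    };
    // Fail closed: TCP without a token is a construction error.
    match token {
        Some(t) if !t.is_empty() => Ok(Transport::Tcp { base_url, token: t.to_string() }),
        _ => Err(format!("TCP daemon URL {base_url} requires a bearer token")),
    }
}

/// Authorization header value, present only on TCP.
pub fn authorization_header(transport: &Transport) -> Option<String> {
    match transport {
        Transport::Unix { .. } => None,
        Transport::Tcp { token, .. } => Some(format!("Bearer {token}")),
    }
}
```

With this shape, `select_transport("127.0.0.1:43100", None)` is an error at construction, while a `unix:///...` URL succeeds without any token and yields no header.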
Bridge and voidctl previously hardcoded http://127.0.0.1:43100 as the daemon URL fallback; with the new daemon default that's an unreachable endpoint. Discover the AF_UNIX socket on the same chain the daemon advertises when VOID_BOX_BASE_URL is unset; explicit overrides still win for cross-uid TCP deployments.
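A rough sketch of that fallback, using the chain order given in the PR summary (`$XDG_RUNTIME_DIR/voidbox.sock`, then `$TMPDIR/voidbox-$UID.sock`, then `/tmp/voidbox-$UID.sock`). The function names are hypothetical, the environment values are passed in as parameters to keep the sketch pure, and choosing the first candidate accepted by an `exists` predicate is an assumption about how the real chain picks a socket.

```rust
use std::path::{Path, PathBuf};

/// Candidate socket paths in discovery order; unset variables are skipped.
pub fn socket_candidates(
    xdg_runtime_dir: Option<&str>,
    tmpdir: Option<&str>,
    uid: u32,
) -> Vec<PathBuf> {
    let mut out = Vec::new();
    if let Some(dir) = xdg_runtime_dir {
        out.push(PathBuf::from(dir).join("voidbox.sock"));
    }
    if let Some(dir) = tmpdir {
        out.push(PathBuf::from(dir).join(format!("voidbox-{uid}.sock")));
    }
    out.push(PathBuf::from(format!("/tmp/voidbox-{uid}.sock")));
    out
}

/// An explicit override (e.g. the value of VOID_BOX_BASE_URL) always wins;
/// otherwise the first candidate accepted by `exists` becomes a unix:// URL.
pub fn resolve_daemon_url(
    explicit: Option<&str>,
    candidates: &[PathBuf],
    exists: impl Fn(&Path) -> bool,
) -> Option<String> {
    if let Some(url) = explicit {
        return Some(url.to_string());
    }
    candidates
        .iter()
        .find(|p| exists(p.as_path()))
        .map(|p| format!("unix://{}", p.display()))
}
```

Taking the predicate as a parameter keeps the resolver testable without touching the filesystem; a caller could pass `|p| p.exists()` in production.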
Pure-mock tests carry a synthetic unix:// URL on VoidBoxRunRef so the shape matches production. Live-daemon tests fall back to the discovered AF_UNIX socket when VOID_BOX_BASE_URL is unset, otherwise honor the env var so an operator can still point at a TCP daemon.
Browsers can't speak AF_UNIX, so the dev server proxy must terminate at the void-control bridge (HTTP/TCP) which then dispatches to the daemon over whichever transport the daemon was configured with. Documents the three-process dev workflow and removes the daemon URL from the example env file.
The default deployment shape is AF_UNIX same-uid with auto-discovery; TCP is now opt-in for cross-uid deployments and requires a bearer token. Update quick-start, contract-gate invocation, and environment-variable docs accordingly.
Migrate the optional 'serde' feature off the synchronous tiny_http server onto an async stack: axum 0.8 (HTTP server), tokio 1 (current_thread runtime), hyper 1 + hyper-util 0.1 + hyperlocal 0.9 (legacy Client over TCP and AF_UNIX connectors), http-body-util / bytes (Full<Bytes> request bodies), async-trait (async trait methods on the orchestration runtime trait), futures + tower (axum's MakeService support). Also promote libc to a direct unix-target dep so we can call geteuid() and access(W_OK) idiomatically; the existing extern "C" geteuid block and the OpenOptions write probe come out in a follow-up commit.
Mirrors void-box's daemon_listen.rs::is_writable_dir verbatim. access(2) is
the only portable way to ask the kernel whether the calling uid can write to
a directory: ACLs, mount options like noexec/ro, and per-fs perms all flow
through it. The previous create-then-unlink dance approximated this for the
common case but disagreed in exotic edge cases (ACL-restricted dirs where
O_CREAT|O_EXCL would succeed but bind would fail) and added a probe-counter,
nanos-based filename, and PROBE_COUNTER static that no longer earn their keep.
Drop the extern "C" { fn libc_geteuid } block as well; libc::geteuid() is
the idiomatic call now that libc is a direct dep.
Replace the hand-rolled HTTP/1.1 dialect over std::net::TcpStream and std::os::unix::net::UnixStream with hyper_util::client::legacy::Client over HttpConnector (TCP) and hyperlocal::UnixConnector (AF_UNIX). Both transports drive a Full<Bytes> request body and share one URI-construction code path, matching the void-box CLI backend post-#59. VoidBoxRuntimeClient::start, stop, inspect, list_runs, subscribe_events, fetch_structured_output, fetch_named_artifact, and the internal http_get / http_post / fetch_converted_run / find_manifest_artifact_path helpers all become async fn. The HttpTransport trait gets #[async_trait] so the boxed trait object stays object-safe. The TCP transport unconditionally injects 'Authorization: Bearer <token>' when a token is configured; the AF_UNIX transport never sends an auth header because the daemon authenticates AF_UNIX peers by uid via the kernel's 0o600 perms and an unnecessary credential widens the leak blast radius. Tests migrate to #[tokio::test] where they exercise the async surface; the filter_events_from_id pure-function test stays sync. The two on-the-wire auth-header guards (no Authorization on AF_UNIX, Bearer on TCP) move to httpmock so they exercise the production hyper-util path rather than a hand-built listener thread. Adds an end-to-end TCP test exercising the new transport against httpmock.
ExecutionRuntime gains #[async_trait] on its three I/O-bound methods — start_run, inspect_run, take_structured_output. Pure-compute helpers (persisted_run_handle, inline_poll_*, delivery_run_ref) stay sync. The MessageDeliveryAdapter trait gets the same treatment for inject_at_launch, drain_intents, push_live; the HttpSidecarAdapter impl awaits the underlying hyper-util transport. ExecutionService methods that fan out to the runtime become async fn: run_to_completion, process_execution, dispatch_execution_once, bridge_dispatch_execution_once, plan_execution, the *_claimed helpers, wait_for_terminal_run, dispatch_candidate, resume_running_candidate, try_finalize_candidate_from_ready_output, finalize_candidate_after_terminal_inspection, drain_delivery_intents, collect_candidate_intents, collect_candidate_intents_best_effort, and execute_execution. Pure-compute helpers (plan_execution_claimed, plan_iteration_candidates, materialize_iteration_inboxes, the persistence helpers, the strategy interior) stay sync — they're called freely from async contexts. with_claimed_execution now takes a closure returning a futures::BoxFuture so the await chain reaches inside the claim/refresh/release scope. The refresh watcher stays a std::thread because it does sync filesystem work and we want it to keep running if the surrounding tokio task parks. The MockRuntime impl of ExecutionRuntime is sync-in-async: it wraps existing in-process methods in async fn that don't await anything, so test code that seeds a MockRuntime needs only to add #[tokio::test] and .await at call sites; the deterministic semantics are unchanged. To keep the no-default-features build coherent without dragging in the heavy HTTP stack, async-trait, tokio (rt + macros + time), and futures move from optional 'serde' deps to unconditional deps. The 'serde' feature continues to gate axum, hyper, hyper-util, hyperlocal, http-body-util, bytes, tower, serde, serde_json, serde_yaml, and rustyline.
The 22 routes from PR #6 each get a thin async fn shim that delegates to the existing handle_* helpers (handle_execution_*, handle_team_*, handle_batch_*, handle_template_*, handle_launch). The handlers themselves still build an internal JsonHttpResponse so the HTTP-status-and-body contract stays byte-for-byte compatible with the previous tiny_http server; into_axum is the only adapter that runs at the axum boundary, and it just sets Content-Type: application/json. CORS moves to a single tower_http::cors layer at the router level. run_bridge becomes async and runs both the worker tick and the axum server inside a tokio LocalSet so neither needs to be Send. ExecutionService<R> holds a Box<dyn ProviderLaunchAdapter> with no Send bound, and a current_thread runtime makes Send moot anyway. The worker tick uses spawn_local for the same reason, and tokio::time::sleep replaces the std::thread::sleep so the parking happens on the async timer wheel. handle_launch becomes async to .await the now-async VoidBoxRuntimeClient::start. process_pending_executions_once and the for_test variant become async to .await service.plan_execution / bridge_dispatch_execution_once. The handle_bridge_request test dispatcher and the handle_bridge_request_*_for_test helpers become async; tests under tests/ that call them get .await added in the next test-migration commit. axum's standard with_graceful_shutdown(...) is wired to a SIGINT/SIGTERM hook so in-flight requests drain when the bridge stops. The make_header / to_tiny_response shims and the std::thread import are gone.
Move the binary onto #[tokio::main(flavor = "current_thread")] with the single-thread doc-comment explaining when to revisit (SSE/WebSocket streaming or throughput becoming a bottleneck).
bridge_request retires its TcpStream + hand-rolled HTTP/1.1; in its place is a thin async wrapper that dispatches through the same hyper-util legacy Client used by void_box.rs. The bridge client is local to each call rather than shared across the whole CLI invocation; reuse becomes worth wiring through context only when interactive sessions get chatty.
run() becomes async fn so the .await chain reaches every callsite. The nested stream_run() helper inside run() is also async fn; std::thread::sleep between polls swaps to tokio::time::sleep so the parking happens on the async timer wheel rather than blocking the executor thread.
Eight Result::and_then(|x| bridge_request(...)) shapes won't compose with .await (futures aren't Try), so each gets unfolded into a let-else with an explicit error branch. The other 24 bridge_request callsites just gain .await before the closing match brace. Same treatment for client.start / inspect / subscribe_events / stop / list_runs in the interactive shell: all become .await calls into the now-async VoidBoxRuntimeClient.
The current_thread tokio runtime never moves tasks between threads, so the orchestration trait futures don't need Send. Forcing Send would require the test recorders that hold Rc<RefCell<...>> to switch to Arc<Mutex<...>> for no behavioural reason, and the ExecutionService<R> type holds a Box<dyn ProviderLaunchAdapter> that has no Send bound either. The HttpTransport trait stays Send + Sync: its production impls hold a hyper-util Client, which is Send, and the bridge's axum::Handler bound on each route requires Send-bounded handler futures. The chain HttpTransport (Send) -> VoidBoxRuntimeClient inherent methods (Send) -> axum handler (Send) keeps the wire-level dispatch Send-clean while orchestration internals stay free of Send. Document the runtime-flavor decision in AGENTS.md so the next contributor isn't tempted to rt-multi-thread the bridge without a real reason. Restore the AF_UNIX no-Authorization-header on-the-wire test using a tokio::net::UnixListener one-shot server; httpmock 0.7 has no AF_UNIX support, but the raw read + string-search is sufficient because hyper-util emits a deterministic HTTP/1.1 dialect we can grep. Rewrite the with_claimed_execution refresh test as #[tokio::test(flavor = current_thread)] with spawn_local + tokio::sync::oneshot replacing the std::thread::spawn + std::sync::mpsc shape; the std::thread refresher inside with_claimed_execution still runs on a real OS thread, so the refresh semantics are preserved.
Tests that exercise the now-async ExecutionRuntime, the orchestration service entrypoints (process_execution, dispatch_execution_once, bridge_dispatch_execution_once, plan_execution, run_to_completion), the bridge test helpers (handle_bridge_request_*_for_test), and process_pending_executions_once_for_test become #[tokio::test] async fn, with .await propagated to every callsite. Pure-compute tests (execution_spec_validation, execution_dry_run, batch schema parsing, etc.) stay #[test] sync.
Test trait impls of ExecutionRuntime and MessageDeliveryAdapter gain #[async_trait::async_trait(?Send)] and async fn signatures on start_run, inspect_run, take_structured_output, inject_at_launch, drain_intents, push_live. The ?Send variant is required because the recorders use Rc<RefCell<...>>; that's also why the tests run inside a current_thread runtime.
execution_bridge_live's pause/cancel actor previously used std::thread::spawn + std::thread::sleep to race the worker tick; rewritten as tokio::task::spawn_local + tokio::time::sleep wrapped in a LocalSet so the actor runs as a local task like the worker. The _inner extraction keeps the LocalSet outer scope readable.
execution_strategy_acceptance's run_mode_to_completion / run_mode_with_all_failures helpers become async fn; their callers gain .await. execution_worker's tick_bridge_worker_until_terminal helper becomes async for the same reason. execution_message_box and execution_message_delivery flip the two service-level tests that fan out to the runtime to #[tokio::test]. void_box_contract's sidecar smoke test (live-only, #[ignore]) becomes #[tokio::test] and awaits inject_at_launch / drain_intents.
Both traits get the standard #[async_trait] (Send-bounded) treatment; their type bound becomes Send. ProviderLaunchAdapter, HttpTransport, and the boxed trait objects in ExecutionService<R> follow suit. The impl<R> ExecutionService<R> block gains an R: ExecutionRuntime + Sync bound so shared-borrow paths (self.runtime.inspect_run() across an .await, etc.) compose cleanly under Send-bounded futures. with_claimed_execution's closure-passed-future swaps from LocalBoxFuture to BoxFuture; T: Send is added so the result reaches the boxed future safely. None of the 4 callsites need rewriting because they were already Box::pin-wrapping Send-only state. This is the conventional shape Rust collaborators expect, and it keeps the trait surface portable across tokio runtime flavors. Flipping voidctl::main to rt-multi-thread later is a one-line macro change rather than a multi-week trait-bound refactor; that's the rationale for paying the (small) cost now while the codebase is small.
Now that ExecutionRuntime, MessageDeliveryAdapter, ProviderLaunchAdapter, and HttpTransport are all Send + Sync, the worker tick is Send and can run under tokio::spawn directly. axum::serve runs alongside, sharing whatever tokio runtime voidctl::main set up. process_pending_executions_once and process_pending_executions_once_for_test gain an R: ExecutionRuntime + Sync bound to match the trait's expectations on cross-await &self borrows. Same edit shape as the impl<R> ExecutionService<R> block in the previous commit. Drops the LocalSet, spawn_local, and run_until choreography. Drops the doc-comment that explained why we needed them — no longer applicable.
Tests of ExecutionRuntime / MessageDeliveryAdapter / ProviderLaunchAdapter need Send + Sync mocks now that the production traits are Send + Sync. execution_message_box's RecordingRuntime / RecordingLaunchAdapter recorders, execution_message_delivery's RecordingDeliveryRuntime / SeededDeliveryRuntime starts vec, and execution_worker's StepwiseRuntime inspect_counts all swap to Arc<Mutex<…>>; .borrow()/.borrow_mut() become .lock().expect(...). The test trait-impl attributes flip from #[async_trait::async_trait(?Send)] to the standard #[async_trait::async_trait] to match the new trait shape. A couple of assertion sites had to be split into a let-bound MutexGuard + drop pattern to satisfy the borrow checker; the .borrow() variants got that for free.
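The recorder swap can be illustrated without any async machinery. This is a sketch only: the field name `starts` and the methods are hypothetical stand-ins for the test mocks named above, and `assert_send_sync` is a helper introduced here to show the bound at compile time.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical recorder in the Arc<Mutex<…>> shape the tests moved to.
#[derive(Clone, Default)]
pub struct RecordingRuntime {
    starts: Arc<Mutex<Vec<String>>>,
}

impl RecordingRuntime {
    /// Was `.borrow_mut().push(..)` in the Rc<RefCell<…>> version.
    pub fn record_start(&self, run_id: &str) {
        self.starts
            .lock()
            .expect("starts lock poisoned")
            .push(run_id.to_string());
    }

    /// Let-bound guard + explicit drop: the lock is released before the
    /// snapshot is handed to the caller for assertions.
    pub fn started(&self) -> Vec<String> {
        let guard = self.starts.lock().expect("starts lock poisoned");
        let snapshot = guard.to_vec();
        drop(guard);
        snapshot
    }
}

/// Compiles only for types that can be shared across threads.
pub fn assert_send_sync<T: Send + Sync>() {}
```

An `Rc<RefCell<Vec<String>>>` field would make `assert_send_sync::<RecordingRuntime>()` fail to compile, which is the same constraint the Send-bounded `#[async_trait]` futures impose on the mocks.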
The bridge_pause_resume_and_cancel_work_against_live_daemon test races a pause-then-cancel actor against the worker tick; with the trait surface now Send + Sync, both sides run under tokio::spawn and the LocalSet wrapper goes away. The _inner extraction collapses back into the test fn body.
…+Sync
Rewrites the 'Async runtime' section to reflect the new posture:
- All async traits are bounded Send + Sync; no LocalSet / spawn_local anywhere; test mocks use Arc<Mutex<…>>.
- The runtime flavor stays current_thread on workload grounds (no SSE, no concurrent long-lived connections, no compute-bound parallelism).
- The cost of flipping to rt-multi-thread later is a one-line macro change in voidctl::main rather than a trait-bound refactor across the crate.
That decoupling is the rationale for paying the Send + Sync cost up front.
The conventional default for HTTP services in Rust, and the long-term direction the user has confirmed. Send + Sync trait bounds across ExecutionRuntime / MessageDeliveryAdapter / HttpTransport / ProviderLaunchAdapter already support multi-thread natively, so the change is purely cosmetic on the type-system side.
- Cargo.toml: tokio feature 'rt' -> 'rt-multi-thread' (superset).
- src/bin/voidctl.rs: drop the (flavor = "current_thread") arg from #[tokio::main]; idiomatic plain #[tokio::main] picks multi-thread.
- src/bridge.rs / src/orchestration/service.rs: doc-comment cleanups. No prose about cross-flavor portability: the current shape just is multi-thread.
- AGENTS.md: reframe Async runtime section. Drop the SSE / throughput upgrade-trigger prose; add a one-line note that current_thread remains available via the macro arg if some future workload prefers it.
…avor
#[tokio::test] still defaults to current_thread regardless of production flavor; promoting the three tests that spawn parallel tokio tasks gives us runtime parity with the multi-thread bridge so any future Send-issue would surface in test rather than first showing up under load:
- src/orchestration/service.rs::claimed_execution_refresh_*: races an OS-thread refresh watcher against a tokio task that holds the claim; cross-thread behaviour is exactly what the test asserts.
- src/runtime/void_box.rs::unix_transport_emits_no_authorization_header: spawns a one-shot UnixListener task that the client request races against.
- tests/execution_bridge_live.rs::bridge_pause_resume_and_cancel_*: spawns pause / cancel actor tasks that race the worker tick.
Pure-compute tests and tests that await a single in-process call stay on the default flavor; promoting them gives no signal.
…:sleep
Seven sites in async contexts where std::thread::sleep would block an executor worker thread under rt-multi-thread:
- src/bin/voidctl.rs: the ExecutionCommand::Watch poll loop. Companion to stream_run, which was migrated earlier.
- tests/execution_bridge_live.rs: four poll-and-wait loops inside #[tokio::test] bodies (the per-execution wait, the parallel multi-execution wait, the cancel-detection loop, the transform-swarm acceptance wait).
- tests/execution_worker.rs: the pause and cancel test actors get hoisted from std::thread::spawn (which would still work, but spawns an OS thread per test for nothing) into tokio::spawn with tokio::time::sleep. Now they run on the same tokio runtime as the test body.
Sync std::thread::sleep stays in places where the surrounding code is sync: the with_claimed_execution refresh watcher (which runs on a real OS thread by design) and the lib-test wait_for_claim_refresh helper (called from spawn_blocking).
The previous shape rebuilt the client at every worker iteration (bridge.rs:319 in the prior commit) and at startup (bridge.rs:311). Two failure modes:
- Misconfigured TCP at startup → panic at construction time → process aborts. Loud, correct.
- Misconfigured TCP arising later (e.g. token file disappears) → the per-tick rebuild panics on a spawned tokio task → that task dies silently while axum keeps serving HTTP indefinitely.
Building the client once at startup makes the panic path the only path: the only construction site is at bridge boot, where the panic visibly aborts the process.
To support the share-via-Arc pattern, VoidBoxRuntimeClient becomes Clone: the boxed transport moves from Box<dyn HttpTransport + Send + Sync> to Arc<dyn HttpTransport + Send + Sync>. Cloning is cheap; the wrapped hyper-util Client already pools connections internally, so all clones share one pool. HttpSidecarAdapter gets the same Arc treatment for consistency, and so a sidecar instance shared across handlers stays pool-warm too.
The previous 'pool can recycle if the daemon restarts' rationale was wrong: hyper-util reconnects on its own when a pooled connection goes stale. The per-tick reconstruction added nothing.
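The Box-to-Arc move is what makes the client cheaply cloneable. Below is a minimal sketch under stated assumptions: the trait here is a synchronous, one-method stand-in for the real async `HttpTransport`, and `FakeTransport`, `with_transport`, and `shares_transport_with` are hypothetical names introduced for illustration.

```rust
use std::sync::Arc;

/// Simplified, synchronous stand-in for the transport trait.
pub trait HttpTransport {
    fn get(&self, path: &str) -> String;
}

/// Test double that echoes the request line instead of doing I/O.
pub struct FakeTransport;

impl HttpTransport for FakeTransport {
    fn get(&self, path: &str) -> String {
        format!("GET {path}")
    }
}

/// Deriving Clone works because Arc<dyn Trait> is Clone; Box<dyn Trait> is not.
#[derive(Clone)]
pub struct VoidBoxRuntimeClient {
    transport: Arc<dyn HttpTransport + Send + Sync>,
}

impl VoidBoxRuntimeClient {
    pub fn with_transport(transport: Arc<dyn HttpTransport + Send + Sync>) -> Self {
        Self { transport }
    }

    /// True when both handles point at the same transport (and thus one pool).
    pub fn shares_transport_with(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.transport, &other.transport)
    }

    pub fn inspect(&self, run_id: &str) -> String {
        self.transport.get(&format!("/v1/runs/{run_id}"))
    }
}
```

Each worker tick can then take `client.clone()`: the clone bumps a reference count and reuses the transport built at boot, so no construction (and no construction-time panic) happens after startup.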
- daemon_address: is_writable_dir uses OsStr::as_encoded_bytes() to match
void-box's daemon_listen.rs::is_writable_dir verbatim. Same Unix bytes
on every supported platform; the module's lockstep doc-comment now
reflects what's actually there.
- bridge.rs / voidctl.rs: trim historical-narration prose. The
'previous tiny_http' / 'legacy dispatch helpers' wording becomes
structural ('CORS surface', 'wire-format-compatible'). Doc on the
voidctl bridge client drops the 'legacy' adjective; the upstream type
name still uses it but the doc comment doesn't need to surface it.
- bridge.rs: hoist the in-fn axum / tower-http / std::sync::Arc / tokio
imports to module-scope use statements, matching the codebase
convention. voidctl.rs gains module-scope use statements for the
hyper-util stack used by bridge_request and build_bridge_client; the
big run() body keeps its own scoped imports (rustyline / serde /
void_control) by long-standing pattern.
- void_box.rs: tighten dispatch_unix_url_selects_unix_transport's
assertion. Now matches the structural shape of the error our
send_with_timeout helper produces (code, retryable flag, message
prefix — all set by us explicitly), so the assertion no longer
depends on hyper-util's internal wording.
- service.rs: add a one-line note on the env-mutating
claimed_execution_refresh_* test that VOID_CONTROL_CLAIM_TTL_MS and
VOID_CONTROL_CLAIM_REFRESH_MS are process-global; if a second
env-mutating test lands in this module, both need to share an
env-lock the way daemon_address::tests::with_env does.
Three doc-comment sites in void_box.rs called the upstream type a 'Hyper legacy client' when they really just meant 'pooled hyper-util client'. Match the same wording style applied to voidctl.rs in the previous nit commit.
The previous shape relied on `VoidBoxRuntimeClient::new` resolving a bearer token from the search chain (env vars, then `~/.config/voidbox/daemon-token`). On a developer box that already had the void-box CLI auto-generate a token file, the resolution succeeded silently; on a fresh CI runner with no token file it panicked with "requires a bearer token". Wrap the test body with `with_env` setting `VOIDBOX_DAEMON_TOKEN` so the resolution finds the explicit env var and never touches disk. Switch from `#[tokio::test]` to sync `#[test]` + `Runtime::block_on` because `with_env` holds a `std::sync::Mutex` for env serialization, which must not be held across `.await` on a multi-thread runtime. The block_on is in test code only.
Pull request overview
Modernizes void-control’s daemon/bridge transport stack to match void-box’s new defaults (AF_UNIX by default; TCP opt-in with bearer token), and propagates async throughout runtime + orchestration so the bridge and CLI can run cleanly on tokio.
Changes:
- Replaced hand-rolled TCP HTTP client/server with async `hyper-util` clients (TCP + AF_UNIX) and an `axum` bridge server.
- Added `daemon_address` module for AF_UNIX socket discovery + TCP token resolution (mirrors void-box).
- Migrated orchestration/runtime traits and tests to async + `Send + Sync`-friendly mocks.
Reviewed changes
Copilot reviewed 27 out of 28 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| web/void-control-ux/vite.config.ts | Proxy /api to the bridge target (new env override). |
| web/void-control-ux/README.md | Updated dev workflow docs: browser → bridge, bridge → daemon (AF_UNIX/TCP). |
| web/void-control-ux/.env.example | Updated env guidance to focus on bridge as browser endpoint. |
| tests/void_box_contract.rs | Updated live-daemon sidecar smoke test to async adapter calls. |
| tests/team_api.rs | Converted team bridge tests to async handle_bridge_request_* helpers. |
| tests/strategy_scenarios.rs | Converted orchestration scenario tests to async run_to_completion. |
| tests/execution_worker.rs | Converted worker/claiming/bridge tick tests to async + tokio task usage. |
| tests/execution_strategy_acceptance.rs | Converted strategy acceptance tests to async helpers. |
| tests/execution_spec_validation.rs | Converted validation test to async bridge helper. |
| tests/execution_scheduler.rs | Converted scheduler tests to async run_to_completion. |
| tests/execution_message_delivery.rs | Updated delivery adapter tests for async traits + AF_UNIX default run refs. |
| tests/execution_message_box.rs | Updated message box tests for async runtime trait and Arc/Mutex recorders. |
| tests/execution_bridge_live.rs | Updated ignored live bridge tests to resolve AF_UNIX default + async calls. |
| tests/execution_bridge.rs | Converted bridge route tests to async bridge helper. |
| tests/execution_artifact_collection.rs | Converted artifact collection tests to async run_to_completion. |
| tests/batch_api.rs | Converted batch/yolo bridge tests to async bridge helper. |
| src/runtime/void_box.rs | Implemented async daemon transport via hyper-util (TCP + AF_UNIX) with timeouts and token enforcement. |
| src/runtime/mod.rs | Exported daemon_address and updated runtime trait impls to async. |
| src/runtime/http_sidecar.rs | Migrated sidecar adapter to async transport + AF_UNIX default URL support. |
| src/runtime/delivery.rs | Made MessageDeliveryAdapter async + Send + Sync. |
| src/runtime/daemon_address.rs | New module for socket path discovery + TCP bearer token resolution + permission checks. |
| src/orchestration/service.rs | Made ExecutionRuntime async and propagated awaits through execution processing. |
| src/bridge.rs | Replaced tiny_http with axum + CORS + graceful shutdown; made bridge test helpers async. |
| src/bin/voidctl.rs | Entered tokio runtime; migrated bridge client requests to hyper-util; async CLI loops. |
| README.md | Updated daemon default listener docs and local dev setup (bridge required for browser). |
| Cargo.toml | Added tokio/async-trait always-on; gated hyper/axum stack behind serde; added libc on unix. |
| AGENTS.md | Documented async runtime posture + Send/Sync trait expectations. |
`VoidBoxRuntimeClient::new` and `HttpSidecarAdapter::with_daemon_url`
panic on misconfigured TCP (no resolvable bearer token). The bridge's
startup-construction site is fine with a panic, but library/embedder
consumers shouldn't have to catch unwinds to handle a wrong daemon URL.
Add `try_new` (returns `Result<Self, String>`) alongside `new`, with
`new` reduced to a one-line `unwrap_or_else { panic!(...) }` over the
fallible variant. Same pattern for `HttpSidecarAdapter`: new
`try_new` and `try_with_daemon_url`, panicking `new` /
`with_daemon_url` retained for the ergonomic 95% case.
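The fallible/panicking pair might look roughly like this. It is a sketch rather than the crate's code: the real constructors resolve the token from a search chain, so the explicit `token` parameter, the struct fields, and the accessors are assumptions made to keep the example self-contained.

```rust
#[derive(Debug)]
pub struct VoidBoxRuntimeClient {
    base_url: String,
    token: Option<String>,
}

impl VoidBoxRuntimeClient {
    /// Fallible constructor for library and embedder callers.
    pub fn try_new(base_url: &str, token: Option<&str>) -> Result<Self, String> {
        let is_unix = base_url.starts_with("unix://");
        if !is_unix && token.is_none() {
            return Err(format!(
                "daemon URL {base_url} is TCP and requires a bearer token"
            ));
        }
        Ok(Self {
            base_url: base_url.to_string(),
            token: token.map(str::to_string),
        })
    }

    /// Panicking convenience wrapper: a one-liner over the fallible variant.
    pub fn new(base_url: &str, token: Option<&str>) -> Self {
        Self::try_new(base_url, token).unwrap_or_else(|err| panic!("{err}"))
    }

    pub fn base_url(&self) -> &str {
        &self.base_url
    }

    pub fn has_token(&self) -> bool {
        self.token.is_some()
    }
}
```

Keeping `new` as a thin wrapper means the validation logic lives in exactly one place, and embedders can match on the `Err` instead of catching an unwind.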
The async-runtime documentation in AGENTS.md describes all four runtime traits (`ExecutionRuntime`, `MessageDeliveryAdapter`, `HttpTransport`, `ProviderLaunchAdapter`) as `Send + Sync`. The sibling `MessageDeliveryAdapter` already carries that bound; `ExecutionRuntime` was only `Send`, with use-sites adding `+ Sync` ad-hoc to make the borrow-across-await reasoning hold. Promote the bound to the trait level so the contract is consistent and visible at the declaration. Drop the now-redundant `+ Sync` clauses at `process_pending_executions_once` (`src/bridge.rs:1680`), `process_pending_executions_once_for_test` (`:1803`), and `ExecutionService<R>`'s impl block (`src/orchestration/service.rs:263`). No new constraints in practice — every `ExecutionRuntime` impl in the tree (`MockRuntime`, `VoidBoxRuntimeClient`) is already `Sync`.
… bounds
Three review findings, all in bridge.rs:
1. `CorsLayer::permissive()` was broader than the documented surface
(the doc-comment said `Content-Type` header only + `GET, POST,
PATCH, OPTIONS`; permissive allows arbitrary headers and methods).
Replace with an explicit allow-list matching what the operator UI
actually needs.
2. The axum router had no `fallback` handler, so unknown routes
produced axum's default plain-text 404. The prior tiny_http path
returned a JSON `ApiError { code: "NOT_FOUND", ... }`; clients that
parsed that shape would see a wire-format regression. Add a
`fallback(route_not_found)` mirroring the same JSON shape, plus a
small unit test asserting the status code, content-type, and JSON
body fields.
3. `process_pending_executions_once<R: ExecutionRuntime + Sync>` and
the `_for_test` variant — drop the now-redundant `+ Sync` bound,
since `ExecutionRuntime` is now bounded `Send + Sync` at the trait
level (companion to the orchestration commit).
Vite's `process.env` only carries shell-exported environment, not
values from `.env*` files. The proxy config docs imply
`VITE_VOID_CONTROL_BRIDGE_TARGET` works from a project-local `.env`,
but it did not: only shell exports reached the config.
Switch to the `defineConfig(({ mode }) => { const env = loadEnv(...) })`
form so `.env`, `.env.local`, etc. are honored as documented.
`bridge_request` constructs a fresh hyper-util client per call. The prior doc-comment claimed the client kept the keep-alive socket warm across calls inside `run()`; that's never been true since the per-call construction landed. Trim the doc-comment to describe the actual behaviour. Sharing the client across calls would require threading state through the deeply-nested interactive shell; the per-CLI-invocation request volume is low enough that the latency win isn't worth the plumbing.
The `claimed_execution_refresh_*` lib test was promoted to the `multi_thread` runtime flavor, which exposes a real OS-thread refresh watcher to the tokio scheduler. On a saturated CI runner the watcher thread can be scheduled-out for hundreds of milliseconds at startup, even though the configured refresh interval is 5ms. The test only asserts that *any* refresh has fired before the timeout. Loosen the ceiling from 200ms to 2s so transient CI scheduling delays don't flake the test, while still catching a real "watcher never fires" regression in well under the test-suite's per-test budget.
The dashboard polls daemon-level run state directly (`/v1/runs?state=…`,
`/v1/runs/{id}/...`); these paths aren't part of the bridge's typed
orchestration surface (`/v1/executions`, `/v1/templates`, …). Pre-PR
the Vite proxy went straight to the daemon, so those calls reached the
daemon natively. Now the proxy goes through the bridge per the
trust-model decision (browser hop terminates at the bridge, AF_UNIX
hop owned by the bridge), so unmatched `/v1/runs` paths 404'd —
4 console errors per UI refresh.
Add a generic passthrough on `VoidBoxRuntimeClient::forward` and wire
two axum routes (`/v1/runs` and `/v1/runs/{*rest}`, both `any` method)
that forward method, full path-and-query, and body to the daemon
unchanged. The daemon's status code and body are relayed verbatim;
transport-level errors map to a 502 with the standard `ApiError` JSON
shape.
`HttpResponse`'s `status` and `body` fields promoted from `pub(crate)`
to `pub` so library consumers can pattern-match on the passthrough
result.
New unit test asserts `forward` preserves method, path, query, and
body across an httpmock round-trip.
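The relay rule (daemon replies pass through verbatim, transport failures become a 502) can be sketched as a pure function. Everything below is illustrative: the `String` body type, the `message` field, the `"BAD_GATEWAY"` code value, and the `relay` function itself are assumptions; the PR text only confirms that `HttpResponse` exposes `status` and `body` and that errors use the `ApiError` JSON shape with a `code` field.

```rust
/// Daemon reply as seen by the bridge; a String body is a simplification.
#[derive(Debug, PartialEq)]
pub struct HttpResponse {
    pub status: u16,
    pub body: String,
}

/// Assumed minimal shape of the bridge's JSON error payload.
#[derive(Debug, PartialEq)]
pub struct ApiError {
    pub code: String,
    pub message: String,
}

/// Maps the outcome of a forwarded request to what the browser receives:
/// the daemon's own status and body, or a 502 wrapping the transport error.
pub fn relay(forwarded: Result<HttpResponse, String>) -> (u16, Result<String, ApiError>) {
    match forwarded {
        // Even daemon-side 4xx/5xx replies are relayed unchanged.
        Ok(resp) => (resp.status, Ok(resp.body)),
        // Only failures to reach the daemon are rewritten.
        Err(transport_err) => (
            502,
            Err(ApiError {
                code: "BAD_GATEWAY".to_string(),
                message: transport_err,
            }),
        ),
    }
}
```

The distinction matters for the dashboard: a 404 from the daemon still means "no such run", while a 502 signals that the bridge could not reach the daemon at all.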
The repo had no codified rules for in-source doc comments and inline comments. The void-box AGENTS.md has explicit guidance forbidding historical narration, ticket IDs, private-repo references, and AI-attribution; void-control should match. Add a "Source-code doc and inline comments" subsection under "Documentation expectations" with a concise bullet list. Matches the house style — short imperatives, no long paragraphs.
Summary
void-box's daemon recently switched its default listener to AF_UNIX (`0o600`, kernel-gated on uid), with TCP becoming opt-in behind a bearer token (the-void-ia/void-box#58), and unified its own CLI's daemon transport on `hyper-util` (the-void-ia/void-box#59). Until this PR, void-control still spoke `http://127.0.0.1:43100` over hand-rolled HTTP/1.1 (`std::net::TcpStream` + manual request/response strings), which is broken against any fresh void-box build and architecturally divergent from the rest of the stack.
This PR modernizes the void-control side: async hyper-util client (`HttpConnector` for TCP, `hyperlocal::UnixConnector` for AF_UNIX), `axum` replaces `tiny_http` for the bridge HTTP server, `ExecutionRuntime` / `MessageDeliveryAdapter` traits become async with `Send + Sync` bounds, and voidctl enters a `tokio` `rt-multi-thread` runtime. The new `src/runtime/daemon_address.rs` module mirrors void-box's path-discovery and bearer-token resolution in lockstep.
void-control has no production users yet, so going long-term-correct now (rather than incrementally) avoids a follow-up that would fight the next merged feature for context space.
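The transport split summarized here depends on classifying the daemon URL at construction time. The following is a minimal, std-only sketch of the rules listed under Transport (client side): `unix://` URLs must carry an absolute path, a bare `host:port` is normalized to `http://`, TCP fails closed without a bearer token, and `Authorization` is attached only on TCP. The names `Transport`, `classify`, and `auth_header` are hypothetical and do not come from the PR, and the handling of URLs that already carry a scheme is an assumption.

```rust
/// Transport chosen from the daemon URL (hypothetical type, not from the PR).
#[derive(Debug, PartialEq)]
enum Transport {
    /// AF_UNIX socket at an absolute filesystem path.
    Unix { socket_path: String },
    /// TCP endpoint with an explicit scheme.
    Tcp { base_url: String },
}

/// Classify a daemon URL and fail closed when TCP lacks a bearer token.
fn classify(url: &str, token: Option<&str>) -> Result<Transport, String> {
    if let Some(path) = url.strip_prefix("unix://") {
        // `unix:///abs/path` leaves `/abs/path`; anything else is relative.
        if !path.starts_with('/') {
            return Err(format!("unix URL must use an absolute path: {url}"));
        }
        return Ok(Transport::Unix { socket_path: path.to_string() });
    }
    // Assumption: URLs that already carry a scheme are kept as-is;
    // a bare `host:port` is normalized to `http://host:port`.
    let base_url = if url.contains("://") {
        url.to_string()
    } else {
        format!("http://{url}")
    };
    // TCP is opt-in and requires a token; refuse to construct without one.
    match token {
        Some(t) if !t.is_empty() => Ok(Transport::Tcp { base_url }),
        _ => Err(format!("TCP transport for {base_url} requires a bearer token")),
    }
}

/// Value for the `Authorization` header: set only on TCP, never on AF_UNIX.
fn auth_header(transport: &Transport, token: Option<&str>) -> Option<String> {
    match (transport, token) {
        (Transport::Tcp { .. }, Some(t)) => Some(format!("Bearer {t}")),
        _ => None,
    }
}
```

Returning an error from `classify` rather than deferring the check to request time is what makes the misconfiguration visible at startup instead of on the first daemon call.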
What changes
Transport (client side)
- `src/runtime/void_box.rs`: `TcpHttpTransport` / `UnixHttpTransport` implemented via `hyper_util::client::legacy::Client<HttpConnector|UnixConnector, Full<Bytes>>`. URL classification (`unix:///abs` rejects relative paths loudly; bare `host:port` normalizes to `http://`). Auth header injected only on TCP; AF_UNIX never carries `Authorization` (verified by a wire-level test that reads raw bytes off a `tokio::net::UnixListener`).
- `src/runtime/daemon_address.rs`: new module with the path-discovery chain (`$XDG_RUNTIME_DIR/voidbox.sock` → `$TMPDIR/voidbox-$UID.sock` → `/tmp/voidbox-$UID.sock`), bearer-token resolution (env-var → file → `~/.config/voidbox/daemon-token`), and an owner-only perm check (`mode & 0o077 == 0`). The chain mirrors void-box's `daemon_listen.rs` exactly; the lockstep contract is documented at the module level.
- `libc` is a direct dep for `geteuid` and `access(W_OK)`.
- `VoidBoxRuntimeClient` is now `Clone` (transport field is `Arc<dyn HttpTransport>`); the bridge constructs the client once at startup and clones per worker tick, preserving the warm hyper-util connection pool across iterations.
Bridge (server side)
- `src/bridge.rs`: `tiny_http` swapped for `axum::Router` with `.with_graceful_shutdown(...)`. SIGINT and SIGTERM both wired. The 22 routes (executions, templates, teams, batch; all introduced in template-first agent APIs and high-level authoring surfaces #6 plus the original execution surface) dispatch through the same handler bodies as before; only the HTTP-framework adapter changed. Status codes and JSON body shapes are byte-identical to the prior tiny_http output (verified by the existing integration tests passing without modification).
Async propagation
- `ExecutionRuntime` and `MessageDeliveryAdapter` traits become `#[async_trait]` with `Send + Sync` bounds.
- `ExecutionService` methods that fan out to the runtime are now `async fn`; pure-compute orchestration helpers (planning, persistence, scheduling, reduction, strategies) stay sync. No `block_on` / `spawn_blocking` in production code.
- `Rc<RefCell<…>>` to `Arc<Mutex<…>>` to satisfy the Send bound.
Runtime
- `voidctl` is `#[tokio::main]` (default flavor: multi_thread). `current_thread` remains available via the macro arg if some future workload prefers it.
Frontend
- `/api` → bridge port) rather than the daemon directly. Browsers don't speak AF_UNIX; the bridge does the HTTP→AF_UNIX translation. Same treatment as the production deployment shape.
Docs
- `AGENTS.md`: new "Async runtime" section explaining the `rt-multi-thread` choice and the Send+Sync trait posture.
- `README.md`, `web/void-control-ux/README.md`: daemon-default narrative reframed (AF_UNIX same-uid by default, TCP+token for cross-uid deployments).
Behavior implications
- `daemon_base_url` is now AF_UNIX-resolved via `daemon_address::default_unix_url()`. Same-uid invocations need zero configuration. Operators using TCP must provision a bearer token (env var, file, or implicit `~/.config/voidbox/daemon-token`); the runtime client refuses to construct a TCP transport without one.
- `tokio::signal` listens for SIGINT and SIGTERM; axum's `.with_graceful_shutdown()` drains in-flight requests.
Test plan
- `cargo fmt --all -- --check` clean.
- `cargo clippy --workspace --all-targets --all-features -- -D warnings` clean.
- `cargo test --all-features`: all tests pass; 29 ignored (live-daemon, unchanged).
- `cargo test --no-default-features` clean.
- `cargo build --features serde --bin voidctl --release` clean.
- `httpmock` end-to-end (`tcp_transport_routes_through_httpmock_end_to_end`), AF_UNIX wire-level no-Authorization-header (`unix_transport_emits_no_authorization_header`), token-file perm rejection, path-discovery chain precedence (4 cases).
- `#[tokio::test(flavor = "multi_thread")]`: `claimed_execution_refresh_*`, `unix_transport_emits_no_authorization_header`, `bridge_pause_resume_and_cancel_*`. Other async tests stay on default flavor (`current_thread`), since single-task awaits don't need multi-thread parity.
- `#[test]` migrated to 93 `#[tokio::test]` one-for-one; no coverage loss.
Out of scope
- `claimed_execution_refresh_*` lib test still uses a real OS thread for the refresh watcher (production code does too: it's a sync filesystem-IO loop and shouldn't depend on tokio cooperation).
- `VoidBoxRunRef::daemon_base_url` is informational rather than authoritative now; the `HttpSidecarAdapter` dispatches through the URL it was constructed with. Multi-daemon deployments (one bridge, multiple void-box hosts) would need that field promoted back to authoritative; flagged for future work.
- `tower-http` is added as a dep just for `CorsLayer`. Future tracing/timeouts/etc. layers would land naturally without additional deps.
- `handle_bridge_request_with_dirs_for_test`. An axum-config regression test is a possible follow-up.
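For orientation, the path-discovery chain and owner-only permission check listed under Transport (client side) can be sketched roughly as below. This is a std-only illustration, not the code in `daemon_address.rs`: the function names are hypothetical, the environment values and uid are passed in as parameters so the precedence is easy to test, and the rule for picking among the candidates (existence, writability, or otherwise) is not stated in this description, so the sketch only returns them in order.

```rust
use std::path::PathBuf;

/// Candidate socket paths in precedence order (hypothetical helper).
/// The real module presumably reads the environment and calls
/// `geteuid()` itself; here the inputs are explicit parameters.
fn socket_candidates(
    xdg_runtime_dir: Option<&str>,
    tmpdir: Option<&str>,
    uid: u32,
) -> Vec<PathBuf> {
    let mut out = Vec::new();
    // 1. $XDG_RUNTIME_DIR/voidbox.sock
    if let Some(dir) = xdg_runtime_dir.filter(|d| !d.is_empty()) {
        out.push(PathBuf::from(dir).join("voidbox.sock"));
    }
    // 2. $TMPDIR/voidbox-$UID.sock
    if let Some(dir) = tmpdir.filter(|d| !d.is_empty()) {
        out.push(PathBuf::from(dir).join(format!("voidbox-{uid}.sock")));
    }
    // 3. /tmp/voidbox-$UID.sock is always the last fallback.
    out.push(PathBuf::from(format!("/tmp/voidbox-{uid}.sock")));
    out
}

/// Owner-only check from the description: no group or other bits set.
fn is_owner_only(mode: u32) -> bool {
    mode & 0o077 == 0
}
```

On a real token file the mode would come from `std::fs::metadata(path)?.permissions().mode()` with `std::os::unix::fs::PermissionsExt` in scope; file-type bits above the permission bits do not affect the mask.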