feat: af_unix daemon transport + async hyper-util/axum migration #8

Merged
cspinetta merged 38 commits into main from feat/daemon-unix-transport
Apr 28, 2026

Conversation

@cspinetta
Member

Summary

void-box's daemon recently switched its default listener to AF_UNIX (0o600, kernel-gated on uid) with TCP becoming opt-in behind a bearer token (the-void-ia/void-box#58), and unified its own CLI's daemon transport on hyper-util (the-void-ia/void-box#59). Until this PR, void-control still spoke http://127.0.0.1:43100 over hand-rolled HTTP/1.1 (std::net::TcpStream + manual request/response strings) — broken against any fresh void-box build, and architecturally divergent from the rest of the stack.

This PR modernizes the void-control side: async hyper-util client (HttpConnector for TCP, hyperlocal::UnixConnector for AF_UNIX), axum replaces tiny_http for the bridge HTTP server, ExecutionRuntime/MessageDeliveryAdapter traits become async with Send + Sync bounds, and voidctl enters a tokio rt-multi-thread runtime. The new src/runtime/daemon_address.rs module mirrors void-box's path-discovery and bearer-token resolution in lockstep.

void-control has no production users yet, so going long-term-correct now (rather than incrementally) avoids a follow-up that would fight the next merged feature for context space.

What changes

Transport (client side)

  • src/runtime/void_box.rs: TcpHttpTransport/UnixHttpTransport implemented via hyper_util::client::legacy::Client<HttpConnector|UnixConnector, Full<Bytes>>. URL classification: unix:///abs selects AF_UNIX and relative unix:// paths are rejected loudly; bare host:port normalizes to http://. The auth header is injected only on TCP; AF_UNIX never carries Authorization (verified by a wire-level test that reads raw bytes off a tokio::net::UnixListener).
  • src/runtime/daemon_address.rs: new module with the path-discovery chain ($XDG_RUNTIME_DIR/voidbox.sock → $TMPDIR/voidbox-$UID.sock → /tmp/voidbox-$UID.sock), bearer-token resolution (env-var → file → ~/.config/voidbox/daemon-token), and an owner-only perm check (mode & 0o077 == 0). The chain mirrors void-box's daemon_listen.rs exactly; the lockstep contract is documented at the module level. libc is a direct dep for geteuid and access(W_OK).
  • VoidBoxRuntimeClient is now Clone (transport field is Arc<dyn HttpTransport>); the bridge constructs the client once at startup and clones per worker tick, preserving the warm hyper-util connection pool across iterations.
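The URL classification rules in the first bullet can be sketched with the standard library alone. This is a minimal illustration, not the code in src/runtime/void_box.rs: the `DaemonTarget` enum, the `classify_daemon_url` name, and the handling of unknown schemes are assumptions made for the example.

```rust
/// Which transport a daemon URL selects (hypothetical type, not the crate's).
#[derive(Debug, PartialEq)]
enum DaemonTarget {
    /// AF_UNIX socket at an absolute filesystem path.
    Unix(String),
    /// TCP endpoint, always carried as a full http:// or https:// URL.
    Tcp(String),
}

/// Classify a daemon URL under the rules described in the bullet above.
fn classify_daemon_url(raw: &str) -> Result<DaemonTarget, String> {
    let raw = raw.trim();
    if raw.is_empty() {
        return Err("daemon URL is empty".to_string());
    }
    if let Some(path) = raw.strip_prefix("unix://") {
        // unix:///abs/path leaves "/abs/path" after the prefix; anything else is relative.
        if path.starts_with('/') {
            return Ok(DaemonTarget::Unix(path.to_string()));
        }
        return Err(format!("unix:// URL must use an absolute path, got {raw:?}"));
    }
    if raw.starts_with("http://") || raw.starts_with("https://") {
        return Ok(DaemonTarget::Tcp(raw.to_string()));
    }
    if raw.contains("://") {
        // Assumption: any other scheme is rejected rather than guessed at.
        return Err(format!("unsupported scheme in {raw:?}"));
    }
    // Bare host:port normalizes to plain HTTP.
    Ok(DaemonTarget::Tcp(format!("http://{raw}")))
}
```

Under these assumptions, `classify_daemon_url("127.0.0.1:43100")` yields a TCP target with the `http://` prefix added, while `unix://relative/voidbox.sock` is an error instead of being resolved against the working directory.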

Bridge (server side)

  • src/bridge.rs: tiny_http swapped for an axum::Router served with .with_graceful_shutdown(...). SIGINT and SIGTERM are both wired. The 22 routes (executions, templates, teams, batch: those introduced in "template-first agent APIs and high-level authoring surfaces" #6, plus the original execution surface) dispatch through the same handler bodies as before; only the HTTP-framework adapter changed. Status codes and JSON body shapes are byte-identical to the prior tiny_http output (verified by the existing integration tests passing without modification).

Async propagation

  • ExecutionRuntime and MessageDeliveryAdapter traits become #[async_trait] with Send + Sync bounds. ExecutionService methods that fan out to the runtime are now async fn; pure-compute orchestration helpers (planning, persistence, scheduling, reduction, strategies) stay sync. No block_on/spawn_blocking in production code.
  • Test mocks moved from Rc<RefCell<…>> to Arc<Mutex<…>> to satisfy the Send bound.
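The mock migration in the second bullet follows a common pattern, sketched below with the standard library only. `RecordingRuntime` here is a hypothetical stand-in for the test recorders; its fields and methods are assumptions for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical test recorder: remembers every run id it was asked to start.
#[derive(Clone, Default)]
struct RecordingRuntime {
    // Rc<RefCell<Vec<String>>> here would make the type !Send and fail `assert_send_sync`.
    starts: Arc<Mutex<Vec<String>>>,
}

impl RecordingRuntime {
    fn record_start(&self, run_id: &str) {
        self.starts
            .lock()
            .expect("recorder mutex poisoned")
            .push(run_id.to_string());
    }

    fn recorded(&self) -> Vec<String> {
        // Clone the data out so the MutexGuard drops before the caller asserts.
        self.starts.lock().expect("recorder mutex poisoned").clone()
    }
}

/// Compile-time check that a type satisfies the bounds the async traits require.
fn assert_send_sync<T: Send + Sync>() {}

/// Records from a second OS thread, which only compiles because the recorder is Send.
fn record_from_another_thread(runtime: &RecordingRuntime) {
    let clone = runtime.clone();
    thread::spawn(move || clone.record_start("run-2"))
        .join()
        .expect("worker thread panicked");
}
```

Swapping the field back to `Rc<RefCell<…>>` makes `assert_send_sync::<RecordingRuntime>()` a compile error, which is the same failure a Send-bounded `#[async_trait]` impl would report.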

Runtime

  • voidctl is #[tokio::main] (default flavor: multi_thread). current_thread remains available via the macro arg if some future workload prefers it.

Frontend

  • Vite dev proxy now targets the void-control bridge (/api → bridge port) rather than the daemon directly. Browsers don't speak AF_UNIX; the bridge does the HTTP→AF_UNIX translation. Same treatment as the production deployment shape.

Docs

  • AGENTS.md — new "Async runtime" section explaining the rt-multi-thread choice and the Send+Sync trait posture.
  • README.md, web/void-control-ux/README.md — daemon-default narrative reframed (AF_UNIX same-uid by default, TCP+token for cross-uid deployments).

Behavior implications

  • Default daemon_base_url is now AF_UNIX-resolved via daemon_address::default_unix_url(). Same-uid invocations need zero configuration. Operators using TCP must provision a bearer token (env var, file, or implicit ~/.config/voidbox/daemon-token); the runtime client refuses to construct a TCP transport without one.
  • HTTP wire format unchanged on both sides. axum bridge handlers produce byte-identical responses to the prior tiny_http path; httpmock-based tests pass without modification.
  • voidctl CLI surface unchanged — same subcommands, same flags, same exit codes.
  • Bridge graceful shutdown is now standard axum: tokio::signal listens for SIGINT and SIGTERM, axum's .with_graceful_shutdown() drains in-flight requests.
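The token provisioning described in the first bullet can be sketched as an ordered lookup with an owner-only permission gate. This is a simplified model, not daemon_address.rs itself: `TokenFile`, `resolve_token`, and the choice to reject (rather than skip) a token file with loose permissions are assumptions for illustration; only the `mode & 0o077 == 0` check comes from the description.

```rust
/// Owner-only check from the description: no group or other permission bits set.
fn is_owner_only(mode: u32) -> bool {
    mode & 0o077 == 0
}

/// A token file as the resolver sees it: contents plus Unix mode bits (hypothetical type).
struct TokenFile<'a> {
    contents: &'a str,
    mode: u32,
}

/// Resolve a bearer token: a non-empty env value wins, then token files in order.
/// Assumption: a group- or world-accessible token file is an error, not a silent skip.
fn resolve_token(
    env_token: Option<&str>,
    files: &[TokenFile<'_>],
) -> Result<Option<String>, String> {
    if let Some(token) = env_token.map(str::trim).filter(|t| !t.is_empty()) {
        return Ok(Some(token.to_string()));
    }
    for file in files {
        if !is_owner_only(file.mode) {
            return Err(format!(
                "token file mode {:o} is not owner-only",
                file.mode & 0o777
            ));
        }
        let token = file.contents.trim();
        if !token.is_empty() {
            return Ok(Some(token.to_string()));
        }
    }
    // Nothing configured: the caller decides whether that is fatal (it is for TCP).
    Ok(None)
}
```

Passing the env value and file contents in as arguments keeps the sketch free of process-global state, so it can be exercised without touching the environment or the filesystem.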

Test plan

  • cargo fmt --all -- --check clean.
  • cargo clippy --workspace --all-targets --all-features -- -D warnings clean.
  • cargo test --all-features — all tests pass; 29 ignored (live-daemon, unchanged).
  • cargo test --no-default-features clean.
  • cargo build --features serde --bin voidctl --release clean.
  • New tests cover the transport surface: TCP via httpmock end-to-end (tcp_transport_routes_through_httpmock_end_to_end), AF_UNIX wire-level no-Authorization-header (unix_transport_emits_no_authorization_header), token-file perm rejection, path-discovery chain precedence (4 cases).
  • Runtime-sensitive tests promoted to #[tokio::test(flavor = "multi_thread")]: claimed_execution_refresh_*, unix_transport_emits_no_authorization_header, bridge_pause_resume_and_cancel_*. Other async tests stay on default flavor (current_thread) — single-task awaits don't need multi-thread parity.
  • Pre-existing test count preserved: 93 sync #[test] migrated to 93 #[tokio::test] one-for-one — no coverage loss.
  • Independent code review pass — three should-fix items and five nits, all addressed.

Out of scope

  • The claimed_execution_refresh_* lib test still uses a real OS thread for the refresh watcher (production code does too — it's a sync filesystem-IO loop and shouldn't depend on tokio cooperation).
  • VoidBoxRunRef::daemon_base_url is informational rather than authoritative now; the HttpSidecarAdapter dispatches through the URL it was constructed with. Multi-daemon deployments (one bridge, multiple void-box hosts) would need that field promoted back to authoritative — flagged for future work.
  • tower-http is added as a dep just for CorsLayer. Future tracing/timeouts/etc. layers would land naturally without additional deps.
  • No end-to-end test spawns an axum server and exercises routes via TCP — current tests dispatch directly through handle_bridge_request_with_dirs_for_test. An axum-config regression test is a possible follow-up.

void-box now defaults its daemon listener to AF_UNIX with TCP as opt-in.
Mirror that contract on the client: pick the transport at construction
from the daemon URL scheme, vendor the path-discovery chain so same-uid
invocations need no configuration, and fail closed at construction when
TCP is configured without a bearer token. AF_UNIX requests deliberately
omit the Authorization header to avoid leaking credentials over a
transport that doesn't need them.
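
The contract in this commit message (scheme selects the transport, TCP fails closed without a token, AF_UNIX never carries Authorization) can be illustrated with a small standard-library sketch. The `Transport` enum and its methods are hypothetical; the PR's real types are TcpHttpTransport and UnixHttpTransport behind an async trait.

```rust
/// Illustrative transport enum (hypothetical; not the crate's types).
#[derive(Debug, PartialEq)]
enum Transport {
    Unix { socket_path: String },
    Tcp { base_url: String, token: String },
}

impl Transport {
    /// Pick the transport from the URL scheme; TCP without a token is an error.
    fn from_url(url: &str, token: Option<&str>) -> Result<Self, String> {
        if let Some(path) = url.strip_prefix("unix://") {
            // Any configured token is deliberately dropped on AF_UNIX.
            return Ok(Transport::Unix {
                socket_path: path.to_string(),
            });
        }
        match token {
            Some(token) if !token.is_empty() => Ok(Transport::Tcp {
                base_url: url.to_string(),
                token: token.to_string(),
            }),
            _ => Err(format!("TCP daemon URL {url} requires a bearer token")),
        }
    }

    /// Headers to attach to each request: Authorization only ever appears on TCP.
    fn request_headers(&self) -> Vec<(String, String)> {
        match self {
            Transport::Unix { .. } => Vec::new(),
            Transport::Tcp { token, .. } => {
                vec![("Authorization".to_string(), format!("Bearer {token}"))]
            }
        }
    }
}
```

Because the Unix variant has no token field at all, there is no code path that could attach the credential by accident.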

Bridge and voidctl previously hardcoded http://127.0.0.1:43100 as the
daemon URL fallback; with the new daemon default that's an unreachable
endpoint. Discover the AF_UNIX socket on the same chain the daemon
advertises when VOID_BOX_BASE_URL is unset; explicit overrides still
win for cross-uid TCP deployments.
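
A rough sketch of the discovery order follows, with the environment values and the writability probe passed in as arguments so it stays deterministic. The function name, its signature, and the rule that a candidate directory is skipped when it is not writable are assumptions for illustration; the three candidate paths are the ones listed in the PR description.

```rust
use std::path::{Path, PathBuf};

/// Return the first socket path whose parent directory passes `is_writable_dir`.
/// Assumption: unset, empty, or non-writable directories fall through to the next candidate.
fn discover_socket_path(
    xdg_runtime_dir: Option<&str>,
    tmpdir: Option<&str>,
    uid: u32,
    is_writable_dir: impl Fn(&Path) -> bool,
) -> PathBuf {
    if let Some(dir) = xdg_runtime_dir.filter(|d| !d.is_empty()) {
        if is_writable_dir(Path::new(dir)) {
            return Path::new(dir).join("voidbox.sock");
        }
    }
    if let Some(dir) = tmpdir.filter(|d| !d.is_empty()) {
        if is_writable_dir(Path::new(dir)) {
            return Path::new(dir).join(format!("voidbox-{uid}.sock"));
        }
    }
    // Last resort: /tmp, namespaced by uid so users do not collide.
    PathBuf::from(format!("/tmp/voidbox-{uid}.sock"))
}
```

Client and daemon only meet at the same socket if both walk the same chain with the same inputs, which is why the PR treats this order as a lockstep contract with void-box.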

Pure-mock tests carry a synthetic unix:// URL on VoidBoxRunRef so the
shape matches production. Live-daemon tests fall back to the discovered
AF_UNIX socket when VOID_BOX_BASE_URL is unset, otherwise honor the env
var so an operator can still point at a TCP daemon.

Browsers can't speak AF_UNIX, so the dev server proxy must terminate at
the void-control bridge (HTTP/TCP) which then dispatches to the daemon
over whichever transport the daemon was configured with. Documents the
three-process dev workflow and removes the daemon URL from the example
env file.

The default deployment shape is AF_UNIX same-uid with auto-discovery;
TCP is now opt-in for cross-uid deployments and requires a bearer token.
Update quick-start, contract-gate invocation, and environment-variable
docs accordingly.

Migrate the optional 'serde' feature off the synchronous tiny_http server
onto an async stack: axum 0.8 (HTTP server), tokio 1 (current_thread runtime),
hyper 1 + hyper-util 0.1 + hyperlocal 0.9 (legacy Client over TCP and
AF_UNIX connectors), http-body-util / bytes (Full<Bytes> request bodies),
async-trait (async trait methods on the orchestration runtime trait), futures
+ tower (axum's MakeService support).

Also promote libc to a direct unix-target dep so we can call geteuid() and
access(W_OK) idiomatically; the existing extern "C" geteuid block and the
OpenOptions write probe come out in a follow-up commit.

Mirrors void-box's daemon_listen.rs::is_writable_dir verbatim. access(2) is
the only portable way to ask the kernel whether the calling uid can write to
a directory: ACLs, mount options like noexec/ro, and per-fs perms all flow
through it. The previous create-then-unlink dance approximated this for the
common case but disagreed in exotic edge cases (ACL-restricted dirs where
O_CREAT|O_EXCL would succeed but bind would fail) and added a probe-counter,
nanos-based filename, and PROBE_COUNTER static that no longer earn their keep.

Drop the extern "C" { fn libc_geteuid } block as well; libc::geteuid() is
the idiomatic call now that libc is a direct dep.

Replace the hand-rolled HTTP/1.1 dialect over std::net::TcpStream and
std::os::unix::net::UnixStream with hyper_util::client::legacy::Client over
HttpConnector (TCP) and hyperlocal::UnixConnector (AF_UNIX). Both transports
drive a Full<Bytes> request body and share one URI-construction code path,
matching the void-box CLI backend post-#59.

VoidBoxRuntimeClient::start, stop, inspect, list_runs, subscribe_events,
fetch_structured_output, fetch_named_artifact, and the internal http_get /
http_post / fetch_converted_run / find_manifest_artifact_path helpers all
become async fn. The HttpTransport trait gets #[async_trait] so the boxed
trait object stays object-safe.

The TCP transport injects 'Authorization: Bearer <token>' on every request
once a token is configured; the AF_UNIX transport never sends an auth header.
The daemon authenticates AF_UNIX peers by uid through the socket's 0o600
permissions, so a credential there is unnecessary and only widens the leak
blast radius.

Tests migrate to #[tokio::test] where they exercise the async surface; the
filter_events_from_id pure-function test stays sync. The two on-the-wire
auth-header guards (no Authorization on AF_UNIX, Bearer on TCP) move to
httpmock so they exercise the production hyper-util path rather than a
hand-built listener thread. Adds an end-to-end TCP test exercising the new
transport against httpmock.

ExecutionRuntime gains #[async_trait] on its three I/O-bound methods:
start_run, inspect_run, take_structured_output. Pure-compute helpers
(persisted_run_handle, inline_poll_*, delivery_run_ref) stay sync. The
MessageDeliveryAdapter trait gets the same treatment for inject_at_launch,
drain_intents, push_live; the HttpSidecarAdapter impl awaits the underlying
hyper-util transport.

ExecutionService methods that fan out to the runtime become async fn:
run_to_completion, process_execution, dispatch_execution_once,
bridge_dispatch_execution_once, plan_execution, the *_claimed helpers,
wait_for_terminal_run, dispatch_candidate, resume_running_candidate,
try_finalize_candidate_from_ready_output,
finalize_candidate_after_terminal_inspection, drain_delivery_intents,
collect_candidate_intents, collect_candidate_intents_best_effort, and
execute_execution. Pure-compute helpers (plan_execution_claimed,
plan_iteration_candidates, materialize_iteration_inboxes, the persistence
helpers, the strategy interior) stay sync — they're called freely from
async contexts.

with_claimed_execution now takes a closure returning a futures::BoxFuture so
the await chain reaches inside the claim/refresh/release scope. The refresh
watcher stays a std::thread because it does sync filesystem work and we want
it to keep running if the surrounding tokio task parks.
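
The refresh-watcher shape described above can be sketched in isolation. This is not with_claimed_execution: the real function takes a closure returning a boxed future and refreshes a claim on disk, while this stand-in takes a plain closure and only counts refreshes. The `with_refreshed_claim` name and its return value are assumptions for illustration.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Run `work` while a dedicated OS thread keeps "refreshing" a claim.
/// Returns the work's result and how many refreshes happened.
fn with_refreshed_claim<T>(refresh_every: Duration, work: impl FnOnce() -> T) -> (T, usize) {
    let stop = Arc::new(AtomicBool::new(false));
    let refreshes = Arc::new(AtomicUsize::new(0));

    let watcher = {
        let stop = Arc::clone(&stop);
        let refreshes = Arc::clone(&refreshes);
        thread::spawn(move || loop {
            // Stand-in for the sync filesystem write that renews the claim.
            refreshes.fetch_add(1, Ordering::SeqCst);
            if stop.load(Ordering::SeqCst) {
                break;
            }
            thread::sleep(refresh_every);
        })
    };

    let result = work();

    // Release: signal the watcher, then wait for it to exit.
    stop.store(true, Ordering::SeqCst);
    watcher.join().expect("refresh watcher panicked");
    (result, refreshes.load(Ordering::SeqCst))
}
```

Because the watcher is a plain OS thread, it keeps refreshing on its own schedule regardless of what the async executor is doing with the task that holds the claim.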

The MockRuntime impl of ExecutionRuntime is sync-in-async: it wraps existing
in-process methods in async fn that don't await anything, so test code that
seeds a MockRuntime needs only to add #[tokio::test] and .await at call
sites; the deterministic semantics are unchanged.

To keep the no-default-features build coherent without dragging in the heavy
HTTP stack, async-trait, tokio (rt + macros + time), and futures move from
optional 'serde' deps to unconditional deps. The 'serde' feature continues
to gate axum, hyper, hyper-util, hyperlocal, http-body-util, bytes, tower,
serde, serde_json, serde_yaml, and rustyline.

The 22 routes from PR #6 each get a thin async fn shim that delegates to the
existing handle_* helpers (handle_execution_*, handle_team_*, handle_batch_*,
handle_template_*, handle_launch). The handlers themselves still build an
internal JsonHttpResponse so the HTTP-status-and-body contract stays
byte-for-byte compatible with the previous tiny_http server; into_axum is
the only adapter that runs at the axum boundary, and it just sets
Content-Type: application/json. CORS moves to a single tower_http::cors
layer at the router level.

run_bridge becomes async and runs both the worker tick and the axum server
inside a tokio LocalSet so neither needs to be Send. ExecutionService<R>
holds a Box<dyn ProviderLaunchAdapter> with no Send bound, and a
current_thread runtime makes Send moot anyway. The worker tick uses
spawn_local for the same reason, and tokio::time::sleep replaces the
std::thread::sleep so the parking happens on the async timer wheel.

handle_launch becomes async to .await the now-async VoidBoxRuntimeClient::start.
process_pending_executions_once and the for_test variant become async to
.await service.plan_execution / bridge_dispatch_execution_once. The
handle_bridge_request test dispatcher and the handle_bridge_request_*_for_test
helpers become async; tests under tests/ that call them get .await added in
the next test-migration commit.

axum's standard with_graceful_shutdown(...) is wired to a SIGINT/SIGTERM
hook so in-flight requests drain when the bridge stops. The make_header /
to_tiny_response shims and the std::thread import are gone.

Move the binary onto #[tokio::main(flavor = "current_thread")] with the
single-thread doc-comment explaining when to revisit (SSE/WebSocket streaming
or throughput becoming a bottleneck).

bridge_request retires its TcpStream + hand-rolled HTTP/1.1; in its place is
a thin async wrapper that dispatches through the same hyper-util legacy
Client used by void_box.rs. The bridge client is local to each call rather
than shared across the whole CLI invocation; reuse becomes worth wiring
through context only when interactive sessions get chatty.

run() becomes async fn so the .await chain reaches every callsite. The
nested stream_run() helper inside run() is also async fn; std::thread::sleep
between polls swaps to tokio::time::sleep so the parking happens on the
async timer wheel rather than blocking the executor thread.

Eight Result::and_then(|x| bridge_request(...)) shapes won't compose with
.await (futures aren't Try), so each gets unfolded into a let-else with an
explicit error branch. The other 24 bridge_request callsites just gain
.await before the closing match brace.
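
The unfolding can be illustrated with a toy version of the call shape. Everything here is hypothetical: `bridge_request` is a stub rather than the hyper-util wrapper, `execution_path` and the `/v1/executions/...` route are invented for the example, and the small `block_on` exists only so the sketch runs without tokio.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for the async bridge call; the real one goes through hyper-util.
async fn bridge_request(path: String) -> Result<String, String> {
    Ok(format!("GET {path} -> 200"))
}

/// Hypothetical sync step that used to sit in front of `.and_then(...)`.
fn execution_path(id: &str) -> Result<String, String> {
    if id.is_empty() {
        Err("missing execution id".to_string())
    } else {
        Ok(format!("/v1/executions/{id}"))
    }
}

/// Before: `execution_path(id).and_then(|p| bridge_request(p))` worked while
/// bridge_request was sync. With an async callee the closure would yield a
/// future, not a Result, so the chain is unfolded with let-else.
async fn inspect_execution(id: &str) -> Result<String, String> {
    let Ok(path) = execution_path(id) else {
        return Err(format!("cannot inspect execution {id:?}"));
    };
    bridge_request(path).await
}

/// Waker that does nothing; enough for futures that are ready immediately.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor so the sketch runs without tokio.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

The let-else keeps the happy path unindented and puts the error branch next to the fallible step, which is roughly the readability the closure chain gave before.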

Same treatment for client.start / inspect / subscribe_events / stop /
list_runs in the interactive shell: all become .await calls into the
now-async VoidBoxRuntimeClient.

The current_thread tokio runtime never moves tasks between threads, so the
orchestration trait futures don't need Send. Forcing Send would require the
test recorders that hold Rc<RefCell<...>> to switch to Arc<Mutex<...>> for
no behavioural reason, and the ExecutionService<R> type holds a
Box<dyn ProviderLaunchAdapter> that has no Send bound either.

The HttpTransport trait stays Send + Sync: its production impls hold a
hyper-util Client, which is Send, and the bridge's axum::Handler bound on
each route requires Send-bounded handler futures. The chain
HttpTransport (Send) -> VoidBoxRuntimeClient inherent methods (Send) ->
axum handler (Send) keeps the wire-level dispatch Send-clean while
orchestration internals stay free of Send.

Document the runtime-flavor decision in AGENTS.md so the next contributor
isn't tempted to rt-multi-thread the bridge without a real reason.

Restore the AF_UNIX no-Authorization-header on-the-wire test using a
tokio::net::UnixListener one-shot server; httpmock 0.7 has no AF_UNIX
support, but the raw read + string-search is sufficient because hyper-util
emits a deterministic HTTP/1.1 dialect we can grep. Rewrite the
with_claimed_execution refresh test as #[tokio::test(flavor = "current_thread")]
with spawn_local + tokio::sync::oneshot replacing the std::thread::spawn +
std::sync::mpsc shape; the std::thread refresher inside
with_claimed_execution still runs on a real OS thread, so the refresh
semantics are preserved.

Tests that exercise the now-async ExecutionRuntime, the orchestration
service entrypoints (process_execution, dispatch_execution_once,
bridge_dispatch_execution_once, plan_execution, run_to_completion), the
bridge test helpers (handle_bridge_request_*_for_test), and
process_pending_executions_once_for_test become #[tokio::test] async fn,
with .await propagated to every callsite. Pure-compute tests
(execution_spec_validation, execution_dry_run, batch schema parsing, etc.)
stay #[test] sync.

Test trait impls of ExecutionRuntime and MessageDeliveryAdapter gain
#[async_trait::async_trait(?Send)] and async fn signatures on start_run,
inspect_run, take_structured_output, inject_at_launch, drain_intents,
push_live. The ?Send variant is required because the recorders use
Rc<RefCell<...>>; that's also why the tests run inside a current_thread
runtime.

execution_bridge_live's pause/cancel actor previously used
std::thread::spawn + std::thread::sleep to race the worker tick; rewritten
as tokio::task::spawn_local + tokio::time::sleep wrapped in a LocalSet so
the actor is  like the worker. The  extraction keeps the
LocalSet outer scope readable.

execution_strategy_acceptance's run_mode_to_completion / run_mode_with_all_failures
helpers become async fn; their callers gain .await.

execution_worker's tick_bridge_worker_until_terminal helper becomes async
for the same reason.

execution_message_box and execution_message_delivery flip the two service-
level tests that fan out to the runtime to #[tokio::test].

void_box_contract's sidecar smoke test (live-only, #[ignore]) becomes
#[tokio::test] and awaits inject_at_launch / drain_intents.

Both traits get the standard #[async_trait] (Send-bounded) treatment; their
type bound becomes Send. ProviderLaunchAdapter, HttpTransport, and the boxed
trait objects in ExecutionService<R> follow suit. The impl<R> ExecutionService<R>
block gains an R: ExecutionRuntime + Sync bound so shared-borrow paths
(self.runtime.inspect_run() across an .await, etc.) compose cleanly under
Send-bounded futures.

with_claimed_execution's closure-passed-future swaps from LocalBoxFuture to
BoxFuture; T: Send is added so the result reaches the boxed future safely.
None of the 4 callsites need rewriting because they were already
Box::pin-wrapping Send-only state.

This is the conventional shape Rust collaborators expect, and it keeps the
trait surface portable across tokio runtime flavors. Flipping voidctl::main
to rt-multi-thread later is a one-line macro change rather than a multi-week
trait-bound refactor; that's the rationale for paying the (small) cost now
while the codebase is small.

Now that ExecutionRuntime, MessageDeliveryAdapter, ProviderLaunchAdapter,
and HttpTransport are all Send + Sync, the worker tick is Send and can run
under tokio::spawn directly. axum::serve runs alongside, sharing whatever
tokio runtime voidctl::main set up.

process_pending_executions_once and process_pending_executions_once_for_test
gain an R: ExecutionRuntime + Sync bound to match the trait's expectations
on cross-await &self borrows. Same edit shape as the impl<R> ExecutionService<R>
block in the previous commit.

Drops the LocalSet, spawn_local, and run_until choreography. Drops the
doc-comment that explained why we needed them; no longer applicable.

Tests of ExecutionRuntime / MessageDeliveryAdapter / ProviderLaunchAdapter
need Send + Sync mocks now that the production traits are Send + Sync.

execution_message_box's RecordingRuntime / RecordingLaunchAdapter recorders,
execution_message_delivery's RecordingDeliveryRuntime / SeededDeliveryRuntime
starts vec, and execution_worker's StepwiseRuntime inspect_counts all swap
to Arc<Mutex<…>>; .borrow()/.borrow_mut() become .lock().expect(...). The
test trait-impl attributes flip from #[async_trait::async_trait(?Send)] to
the standard #[async_trait::async_trait] to match the new trait shape.

A couple of assertion sites had to be split into a let-bound MutexGuard +
drop pattern to satisfy the borrow checker; the .borrow() variants got that
for free.

The bridge_pause_resume_and_cancel_work_against_live_daemon test races a
pause-then-cancel actor against the worker tick; with the trait surface now
Send + Sync, both sides run under tokio::spawn and the LocalSet wrapper
goes away. The _inner extraction collapses back into the test fn body.

…+Sync

Rewrites the 'Async runtime' section to reflect the new posture:

- All async traits are bounded Send + Sync; no LocalSet / spawn_local
  anywhere; test mocks use Arc<Mutex<…>>.
- The runtime flavor stays current_thread on workload grounds (no SSE,
  no concurrent long-lived connections, no compute-bound parallelism).
- The cost of flipping to rt-multi-thread later is a one-line macro
  change in voidctl::main rather than a trait-bound refactor across the
  crate. That decoupling is the rationale for paying the Send + Sync
  cost up front.

The conventional default for HTTP services in Rust, and the long-term
direction the user has confirmed. Send + Sync trait bounds across
ExecutionRuntime / MessageDeliveryAdapter / HttpTransport / ProviderLaunchAdapter
already support multi-thread natively, so the change is purely cosmetic
on the type-system side.

- Cargo.toml: tokio feature 'rt' -> 'rt-multi-thread' (superset).
- src/bin/voidctl.rs: drop the (flavor = "current_thread") arg from
  #[tokio::main]; idiomatic plain #[tokio::main] picks multi-thread.
- src/bridge.rs / src/orchestration/service.rs: doc-comment cleanups.
  No prose about cross-flavor portability — the current shape just
  is multi-thread.
- AGENTS.md: reframe Async runtime section. Drop the SSE / throughput
  upgrade-trigger prose; add a one-line note that current_thread remains
  available via the macro arg if some future workload prefers it.

…avor

#[tokio::test] still defaults to current_thread regardless of production
flavor; promoting the three tests that spawn parallel tokio tasks gives us
runtime parity with the multi-thread bridge so any future Send-issue would
surface in test rather than first showing up under load:

- src/orchestration/service.rs::claimed_execution_refresh_*: races an
  OS-thread refresh watcher against a tokio task that holds the claim;
  cross-thread behaviour is exactly what the test asserts.
- src/runtime/void_box.rs::unix_transport_emits_no_authorization_header:
  spawns a one-shot UnixListener task that the client request races against.
- tests/execution_bridge_live.rs::bridge_pause_resume_and_cancel_*:
  spawns pause / cancel actor tasks that race the worker tick.

Pure-compute tests and tests that await a single in-process call stay on
the default flavor; promoting them gives no signal.

…:sleep

Seven sites in async contexts where std::thread::sleep would block an
executor worker thread under rt-multi-thread:

- src/bin/voidctl.rs: the ExecutionCommand::Watch poll loop. Companion to
  stream_run, which was migrated earlier.
- tests/execution_bridge_live.rs: four poll-and-wait loops inside #[tokio::test]
  bodies (the per-execution wait, the parallel multi-execution wait, the
  cancel-detection loop, the transform-swarm acceptance wait).
- tests/execution_worker.rs: the pause and cancel test actors get hoisted
  from std::thread::spawn (which would still work, but spawns an OS thread
  per test for nothing) into tokio::spawn with tokio::time::sleep. Now they
  run on the same tokio runtime as the test body.

Sync std::thread::sleep stays in places where the surrounding code is sync:
the with_claimed_execution refresh watcher (which runs on a real OS thread
by design) and the lib-test wait_for_claim_refresh helper (called from
spawn_blocking).

The previous shape rebuilt the client at every worker iteration (bridge.rs:319
in the prior commit) and at startup (bridge.rs:311). Two failure modes:

- Misconfigured TCP at startup → panic at construction time → process aborts.
  Loud, correct.
- Misconfigured TCP arising later (e.g. token file disappears) → the per-tick
  rebuild panics on a spawned tokio task → that task dies silently while
  axum keeps serving HTTP indefinitely.

Building the client once at startup makes the panic path the only path:
the only construction site is at bridge boot, where the panic visibly
aborts the process.

To support the share-via-Arc pattern, VoidBoxRuntimeClient becomes Clone:
the boxed transport moves from Box<dyn HttpTransport + Send + Sync> to
Arc<dyn HttpTransport + Send + Sync>. Cloning is cheap; the wrapped
hyper-util Client already pools connections internally, so all clones
share one pool. HttpSidecarAdapter gets the same Arc treatment for
consistency, and so a sidecar instance shared across handlers stays
pool-warm too.

The previous 'pool can recycle if the daemon restarts' rationale was
wrong: hyper-util reconnects on its own when a pooled connection goes
stale. The per-tick reconstruction added nothing.
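
The share-via-Arc pattern can be sketched without hyper-util. In this illustration the `HttpTransport` trait is a synchronous stand-in for the PR's async trait, `CountingTransport` plays the role of the pooled client, and `RuntimeClient` with its `/runs/...` path is hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Sync stand-in for the PR's async HttpTransport trait.
trait HttpTransport {
    fn get(&self, path: &str) -> String;
}

/// Fake transport that counts requests, standing in for a pooled hyper-util client.
#[derive(Default)]
struct CountingTransport {
    requests: AtomicUsize,
}

impl HttpTransport for CountingTransport {
    fn get(&self, path: &str) -> String {
        let n = self.requests.fetch_add(1, Ordering::SeqCst) + 1;
        format!("response {n} for {path}")
    }
}

/// Cloning copies the Arc pointer only, so every clone shares one transport.
#[derive(Clone)]
struct RuntimeClient {
    transport: Arc<dyn HttpTransport + Send + Sync>,
}

impl RuntimeClient {
    fn new(transport: Arc<dyn HttpTransport + Send + Sync>) -> Self {
        Self { transport }
    }

    fn inspect(&self, run_id: &str) -> String {
        self.transport.get(&format!("/runs/{run_id}"))
    }
}
```

A client built once at startup and cloned per worker tick sends every request through the same transport instance, so whatever connection state that transport holds is reused instead of being rebuilt each iteration.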

- daemon_address: is_writable_dir uses OsStr::as_encoded_bytes() to match
  void-box's daemon_listen.rs::is_writable_dir verbatim. Same Unix bytes
  on every supported platform; the module's lockstep doc-comment now
  reflects what's actually there.

- bridge.rs / voidctl.rs: trim historical-narration prose. The
  'previous tiny_http' / 'legacy dispatch helpers' wording becomes
  structural ('CORS surface', 'wire-format-compatible'). Doc on the
  voidctl bridge client drops the 'legacy' adjective; the upstream type
  name still uses it but the doc comment doesn't need to surface it.

- bridge.rs: hoist the in-fn axum / tower-http / std::sync::Arc / tokio
  imports to module-scope use statements, matching the codebase
  convention. voidctl.rs gains module-scope use statements for the
  hyper-util stack used by bridge_request and build_bridge_client; the
  big run() body keeps its own scoped imports (rustyline / serde /
  void_control) by long-standing pattern.

- void_box.rs: tighten dispatch_unix_url_selects_unix_transport's
  assertion. Now matches the structural shape of the error our
  send_with_timeout helper produces (code, retryable flag, message
  prefix — all set by us explicitly), so the assertion no longer
  depends on hyper-util's internal wording.

- service.rs: add a one-line note on the env-mutating
  claimed_execution_refresh_* test that VOID_CONTROL_CLAIM_TTL_MS and
  VOID_CONTROL_CLAIM_REFRESH_MS are process-global; if a second
  env-mutating test lands in this module, both need to share an
  env-lock the way daemon_address::tests::with_env does.

Three doc-comment sites in void_box.rs called the upstream type a 'Hyper
legacy client' when they really just meant 'pooled hyper-util client'.
Match the same wording style applied to voidctl.rs in the previous nit
commit.

The previous shape relied on `VoidBoxRuntimeClient::new` resolving a
bearer token from the search chain (env vars, then
`~/.config/voidbox/daemon-token`). On a developer box that already had
the void-box CLI auto-generate a token file, the resolution succeeded
silently; on a fresh CI runner with no token file it panicked with
"requires a bearer token".

Wrap the test body with `with_env` setting `VOIDBOX_DAEMON_TOKEN` so
the resolution finds the explicit env var and never touches disk.
Switch from `#[tokio::test]` to sync `#[test]` + `Runtime::block_on`
because `with_env` holds a `std::sync::Mutex` for env serialization,
which must not be held across `.await` on a multi-thread runtime.
The block_on is in test code only.

Copilot AI left a comment


Pull request overview

Modernizes void-control’s daemon/bridge transport stack to match void-box’s new defaults (AF_UNIX by default; TCP opt-in with bearer token), and propagates async throughout runtime + orchestration so the bridge and CLI can run cleanly on tokio.

Changes:

  • Replaced hand-rolled TCP HTTP client/server with async hyper-util clients (TCP + AF_UNIX) and an axum bridge server.
  • Added daemon_address module for AF_UNIX socket discovery + TCP token resolution (mirrors void-box).
  • Migrated orchestration/runtime traits and tests to async + Send + Sync-friendly mocks.

Reviewed changes

Copilot reviewed 27 out of 28 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| web/void-control-ux/vite.config.ts | Proxy /api to the bridge target (new env override). |
| web/void-control-ux/README.md | Updated dev workflow docs: browser → bridge, bridge → daemon (AF_UNIX/TCP). |
| web/void-control-ux/.env.example | Updated env guidance to focus on bridge as browser endpoint. |
| tests/void_box_contract.rs | Updated live-daemon sidecar smoke test to async adapter calls. |
| tests/team_api.rs | Converted team bridge tests to async handle_bridge_request_* helpers. |
| tests/strategy_scenarios.rs | Converted orchestration scenario tests to async run_to_completion. |
| tests/execution_worker.rs | Converted worker/claiming/bridge tick tests to async + tokio task usage. |
| tests/execution_strategy_acceptance.rs | Converted strategy acceptance tests to async helpers. |
| tests/execution_spec_validation.rs | Converted validation test to async bridge helper. |
| tests/execution_scheduler.rs | Converted scheduler tests to async run_to_completion. |
| tests/execution_message_delivery.rs | Updated delivery adapter tests for async traits + AF_UNIX default run refs. |
| tests/execution_message_box.rs | Updated message box tests for async runtime trait and Arc/Mutex recorders. |
| tests/execution_bridge_live.rs | Updated ignored live bridge tests to resolve AF_UNIX default + async calls. |
| tests/execution_bridge.rs | Converted bridge route tests to async bridge helper. |
| tests/execution_artifact_collection.rs | Converted artifact collection tests to async run_to_completion. |
| tests/batch_api.rs | Converted batch/yolo bridge tests to async bridge helper. |
| src/runtime/void_box.rs | Implemented async daemon transport via hyper-util (TCP + AF_UNIX) with timeouts and token enforcement. |
| src/runtime/mod.rs | Exported daemon_address and updated runtime trait impls to async. |
| src/runtime/http_sidecar.rs | Migrated sidecar adapter to async transport + AF_UNIX default URL support. |
| src/runtime/delivery.rs | Made MessageDeliveryAdapter async + Send + Sync. |
| src/runtime/daemon_address.rs | New module for socket path discovery + TCP bearer token resolution + permission checks. |
| src/orchestration/service.rs | Made ExecutionRuntime async and propagated awaits through execution processing. |
| src/bridge.rs | Replaced tiny_http with axum + CORS + graceful shutdown; made bridge test helpers async. |
| src/bin/voidctl.rs | Entered tokio runtime; migrated bridge client requests to hyper-util; async CLI loops. |
| README.md | Updated daemon default listener docs and local dev setup (bridge required for browser). |
| Cargo.toml | Added tokio/async-trait always-on; gated hyper/axum stack behind serde; added libc on unix. |
| AGENTS.md | Documented async runtime posture + Send/Sync trait expectations. |

`VoidBoxRuntimeClient::new` and `HttpSidecarAdapter::with_daemon_url`
panic on misconfigured TCP (no resolvable bearer token). The bridge's
startup-construction site is fine with a panic, but library/embedder
consumers shouldn't have to catch unwinds to handle a wrong daemon URL.

Add `try_new` (returns `Result<Self, String>`) alongside `new`, with
`new` reduced to a one-line `unwrap_or_else { panic!(...) }` over the
fallible variant. Same pattern for `HttpSidecarAdapter`: new
`try_new` and `try_with_daemon_url`, panicking `new` /
`with_daemon_url` retained for the ergonomic 95% case.
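
The fallible/panicking constructor pair described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `DaemonClient`, its fields, and the token argument are hypothetical stand-ins, and the real `VoidBoxRuntimeClient` resolves its token differently.

```rust
/// Hypothetical stand-in for a daemon client; only the constructor shape matters.
#[derive(Debug)]
pub struct DaemonClient {
    base_url: String,
    bearer_token: Option<String>,
}

impl DaemonClient {
    /// Fallible constructor: a TCP URL without a bearer token is reported as
    /// an error so that embedders can handle it without catching an unwind.
    pub fn try_new(base_url: &str, bearer_token: Option<String>) -> Result<Self, String> {
        let is_unix = base_url.starts_with("unix://");
        if !is_unix && bearer_token.is_none() {
            return Err(format!(
                "TCP daemon URL {} has no resolvable bearer token",
                base_url
            ));
        }
        Ok(Self {
            base_url: base_url.to_string(),
            // Assumption for this sketch: AF_UNIX clients never keep a token.
            bearer_token: if is_unix { None } else { bearer_token },
        })
    }

    /// Panicking convenience wrapper: a one-liner over the fallible variant.
    pub fn new(base_url: &str, bearer_token: Option<String>) -> Self {
        Self::try_new(base_url, bearer_token)
            .unwrap_or_else(|err| panic!("invalid daemon configuration: {}", err))
    }

    pub fn base_url(&self) -> &str {
        &self.base_url
    }

    pub fn has_token(&self) -> bool {
        self.bearer_token.is_some()
    }
}
```

A startup site can keep calling `new`, while a library consumer would match on the `Result` from `try_new` and surface the message however it likes.
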
The async-runtime documentation in AGENTS.md describes all four
runtime traits (`ExecutionRuntime`, `MessageDeliveryAdapter`,
`HttpTransport`, `ProviderLaunchAdapter`) as `Send + Sync`. The sibling
`MessageDeliveryAdapter` already carries that bound; `ExecutionRuntime`
was only `Send`, with use-sites adding `+ Sync` ad-hoc to make the
borrow-across-await reasoning hold.

Promote the bound to the trait level so the contract is consistent and
visible at the declaration. Drop the now-redundant `+ Sync` clauses at
`process_pending_executions_once` (`src/bridge.rs:1680`),
`process_pending_executions_once_for_test` (`:1803`), and
`ExecutionService<R>`'s impl block (`src/orchestration/service.rs:263`).
No new constraints in practice — every `ExecutionRuntime` impl in the
tree (`MockRuntime`, `VoidBoxRuntimeClient`) is already `Sync`.
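
To illustrate why the trait-level bound makes the use-site `+ Sync` clauses redundant, here is a simplified sketch. It assumes a synchronous method and plain OS threads in place of the real async trait and tokio tasks; the `start` method and `start_on_workers` helper are hypothetical.

```rust
use std::sync::Arc;
use std::thread;

/// Simplified stand-in for the runtime trait. The supertrait bound is the
/// point: every implementor must be `Send + Sync`.
pub trait ExecutionRuntime: Send + Sync {
    fn start(&self, spec: &str) -> String;
}

pub struct MockRuntime;

impl ExecutionRuntime for MockRuntime {
    fn start(&self, spec: &str) -> String {
        format!("run-{}", spec)
    }
}

/// No `+ Sync` is written here. `Arc<R>` can only cross threads when
/// `R: Send + Sync`, and that is already implied by `R: ExecutionRuntime`.
pub fn start_on_workers<R: ExecutionRuntime + 'static>(
    runtime: Arc<R>,
    specs: Vec<String>,
) -> Vec<String> {
    let handles: Vec<_> = specs
        .into_iter()
        .map(|spec| {
            let rt = Arc::clone(&runtime);
            thread::spawn(move || rt.start(&spec))
        })
        .collect();
    // Joining in spawn order keeps results aligned with the input specs.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker panicked"))
        .collect()
}
```

If the `Send + Sync` supertraits were removed from the trait, `start_on_workers` would stop compiling until the bounds were restated at the function, which is the ad-hoc pattern the commit removes.
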
… bounds

Three review findings, all in bridge.rs:

1. `CorsLayer::permissive()` was broader than the documented surface
   (the doc-comment said `Content-Type` header only + `GET, POST,
   PATCH, OPTIONS`; permissive allows arbitrary headers and methods).
   Replace with an explicit allow-list matching what the operator UI
   actually needs.

2. The axum router had no `fallback` handler, so unknown routes
   produced axum's default plain-text 404. The prior tiny_http path
   returned a JSON `ApiError { code: "NOT_FOUND", ... }`; clients that
   parsed that shape would see a wire-format regression. Add a
   `fallback(route_not_found)` mirroring the same JSON shape, plus a
   small unit test asserting the status code, content-type, and JSON
   body fields.

3. `process_pending_executions_once<R: ExecutionRuntime + Sync>` and
   the `_for_test` variant — drop the now-redundant `+ Sync` bound,
   since `ExecutionRuntime` is now bounded `Send + Sync` at the trait
   level (companion to the orchestration commit).
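
As a rough model of the first finding, the sketch below shows what an explicit allow-list admits compared with a permissive policy. It is a dependency-free illustration only, not the tower-http `CorsLayer` API; the `preflight_allowed` helper is hypothetical, while the method and header values come from the doc-comment quoted above.

```rust
/// Methods the documented surface allows.
const ALLOWED_METHODS: [&str; 4] = ["GET", "POST", "PATCH", "OPTIONS"];
/// Request headers the documented surface allows (stored lowercase).
const ALLOWED_HEADERS: [&str; 1] = ["content-type"];

/// Returns true when a preflight asking for `method` and `requested_headers`
/// fits the allow-list. A permissive policy would return true for everything.
pub fn preflight_allowed(method: &str, requested_headers: &[&str]) -> bool {
    // HTTP methods are case-sensitive, so compare them exactly.
    let method_ok = ALLOWED_METHODS.contains(&method);
    // Header names are case-insensitive.
    let headers_ok = requested_headers.iter().all(|requested| {
        ALLOWED_HEADERS
            .iter()
            .any(|allowed| allowed.eq_ignore_ascii_case(requested))
    });
    method_ok && headers_ok
}
```

Under this model a `DELETE` preflight, or one that asks to send an `Authorization` header, is refused, whereas the permissive layer would have accepted both.
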
In the Vite config, `process.env` only carries the shell-exported
environment, not values from `.env*` files. The proxy config docs imply
`VITE_VOID_CONTROL_BRIDGE_TARGET` works from a project-local `.env`,
but it did not: only shell exports reached the config.

Switch to the `defineConfig(({ mode }) => { const env = loadEnv(...) })`
form so `.env`, `.env.local`, etc. are honored as documented.

`bridge_request` constructs a fresh hyper-util client per call. The
prior doc-comment claimed the client kept the keep-alive socket warm
across calls inside `run()`; that's never been true since the per-call
construction landed.

Trim the doc-comment to describe the actual behaviour. Sharing the
client across calls would require threading state through the
deeply-nested interactive shell; the per-CLI-invocation request
volume is low enough that the latency win isn't worth the plumbing.

The `claimed_execution_refresh_*` lib test was promoted to the
`multi_thread` runtime flavor, which exposes a real OS-thread refresh
watcher to the tokio scheduler. On a saturated CI runner the watcher
thread can be descheduled for hundreds of milliseconds at startup,
even though the configured refresh interval is 5ms.

The test only asserts that *any* refresh has fired before the
timeout. Loosen the ceiling from 200ms to 2s so transient CI scheduling
delays don't flake the test, while still catching a real "watcher
never fires" regression in well under the test-suite's per-test
budget.
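
The shape of such a test can be sketched with the standard library alone. This is an assumption-laden illustration rather than the actual test: `spawn_watcher` and `wait_until` are hypothetical helpers, and the real watcher refreshes repeatedly instead of firing once.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Spawns an OS thread that sets a flag after `interval`, standing in for
/// the first refresh fired by the watcher.
pub fn spawn_watcher(interval: Duration) -> Arc<AtomicBool> {
    let fired = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&fired);
    thread::spawn(move || {
        thread::sleep(interval);
        flag.store(true, Ordering::SeqCst);
    });
    fired
}

/// Polls `flag` until it is set or `ceiling` elapses. A generous ceiling only
/// costs time when the watcher is broken; a healthy run returns within
/// milliseconds.
pub fn wait_until(flag: &AtomicBool, ceiling: Duration) -> bool {
    let deadline = Instant::now() + ceiling;
    while Instant::now() < deadline {
        if flag.load(Ordering::SeqCst) {
            return true;
        }
        thread::sleep(Duration::from_millis(1));
    }
    // One last check in case the flag flipped right at the deadline.
    flag.load(Ordering::SeqCst)
}
```

Because the loop exits as soon as the flag is set, raising the ceiling from 200ms to 2s does not slow the passing case; it only widens the tolerance for a delayed first refresh.
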
The dashboard polls daemon-level run state directly (`/v1/runs?state=…`,
`/v1/runs/{id}/...`); these paths aren't part of the bridge's typed
orchestration surface (`/v1/executions`, `/v1/templates`, …). Pre-PR
the Vite proxy went straight to the daemon, so those calls reached the
daemon natively. Now the proxy goes through the bridge per the
trust-model decision (browser hop terminates at the bridge, AF_UNIX
hop owned by the bridge), so unmatched `/v1/runs` paths 404'd —
4 console errors per UI refresh.

Add a generic passthrough on `VoidBoxRuntimeClient::forward` and wire
two axum routes (`/v1/runs` and `/v1/runs/{*rest}`, both `any` method)
that forward method, full path-and-query, and body to the daemon
unchanged. The daemon's status code and body are relayed verbatim;
transport-level errors map to a 502 with the standard `ApiError` JSON
shape.
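
The relay rule can be sketched as a small pure function. Several details here are assumptions for illustration: the `String` body type, the `relay` helper, and the `BAD_GATEWAY` code and `message` field in the error JSON are all hypothetical, since the exact `ApiError` fields are not spelled out above.

```rust
/// Simplified response type with the two public fields mentioned in the PR.
#[derive(Debug, Clone, PartialEq)]
pub struct HttpResponse {
    pub status: u16,
    pub body: String,
}

/// Minimal JSON string escaping so the error message stays valid JSON.
fn json_string(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Relays a daemon response verbatim and maps a transport-level failure to a
/// 502 with a JSON error body.
pub fn relay(result: Result<HttpResponse, String>) -> HttpResponse {
    match result {
        // Whatever the daemon said, including 4xx/5xx, passes through unchanged.
        Ok(response) => response,
        // Only a failure to reach the daemon becomes a bridge-level 502.
        Err(err) => HttpResponse {
            status: 502,
            body: format!(
                "{{\"code\":\"BAD_GATEWAY\",\"message\":{}}}",
                json_string(&err)
            ),
        },
    }
}
```

The distinction matters for the dashboard: a daemon 404 still reaches the browser as a 404, while an unreachable socket is reported as a bridge failure.
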

`HttpResponse`'s `status` and `body` fields are promoted from `pub(crate)`
to `pub` so library consumers can pattern-match on the passthrough
result.

New unit test asserts `forward` preserves method, path, query, and
body across an httpmock round-trip.

The repo had no codified rules for in-source doc comments and inline
comments. The void-box AGENTS.md has explicit guidance forbidding
historical narration, ticket IDs, private-repo references, and
AI-attribution; void-control should match.

Add a "Source-code doc and inline comments" subsection under
"Documentation expectations" with a concise bullet list. Matches the
house style — short imperatives, no long paragraphs.
@cspinetta cspinetta merged commit 935f8f5 into main Apr 28, 2026
6 checks passed
@cspinetta cspinetta deleted the feat/daemon-unix-transport branch April 28, 2026 15:58