
Add Windows daemon mode with named-pipe IPC #200

Open
Snowy7 wants to merge 6 commits into spacedriveapp:main from Snowy7:windows-daemon-ipc

Conversation


@Snowy7 Snowy7 commented Feb 24, 2026

Solves issue #199.

Summary

Adds Windows support for Spacebot daemon mode while preserving the existing daemon abstraction and CLI UX.

It includes two parts:

  1. Fix Windows release builds by gating Unix-only daemon code/dependencies.
  2. Implement Windows background daemon support with detached child process startup and named-pipe IPC for start / stop / status.

Why

cargo build --release on Windows was failing because the Unix-only daemonize crate (and related Unix APIs) were being compiled on Windows.

After fixing the build, Windows still lacked production-grade daemon behavior (background start, IPC control path, liveness checks). This PR fills that gap using Windows-native primitives while keeping the current architecture intact.

What Changed

1) Unix-only daemon dependency gating

  • Moved daemonize = "0.5" behind cfg(unix) in Cargo.toml.
  • Kept libc available for existing Unix/other low-level uses.
  • Added windows-sys (Windows-only) for Win32 APIs used by daemon support and disk stats.
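
Gating of this kind is expressed with Cargo's target-specific dependency tables. A sketch of the layout (the windows-sys feature list shown is illustrative, not necessarily the PR's exact set):

```toml
# Unix-only: daemonize relies on fork/setsid APIs that do not exist on Windows.
[target.'cfg(unix)'.dependencies]
daemonize = "0.5"

# Windows-only: raw Win32 bindings for process, pipe, and disk-space calls.
# Feature names below are examples; enable only what the code actually uses.
[target.'cfg(windows)'.dependencies]
windows-sys = { version = "0.59", features = [
    "Win32_Foundation",
    "Win32_System_Threading",
    "Win32_Storage_FileSystem",
] }
```

With this layout, `cargo build` on Windows never compiles daemonize at all, rather than relying on cfg attributes alone to skip its call sites.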

2) src/daemon.rs kept as the cross-platform boundary

  • Preserved the public daemon API shape used by src/main.rs.
  • Added cfg(unix) and cfg(windows) backends under the same module.
  • Introduced DaemonStartOptions to pass startup context (config path + debug flag) into platform-specific background launch logic.
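
As a rough sketch of that boundary (the real field and function names in src/daemon.rs may differ; only the cfg-gating pattern is the point):

```rust
// Hedged sketch of the cross-platform boundary described above: one public
// API, platform-specific backends selected at compile time.
use std::path::PathBuf;

/// Startup context passed into the platform-specific launch logic.
pub struct DaemonStartOptions {
    pub config_path: PathBuf,
    pub debug: bool,
}

#[cfg(unix)]
fn daemonize(_options: &DaemonStartOptions) -> Result<(), String> {
    // Unix backend: fork/setsid via the daemonize crate (elided here).
    Ok(())
}

#[cfg(windows)]
fn daemonize(_options: &DaemonStartOptions) -> Result<(), String> {
    // Windows backend: detached re-exec of the current binary (elided here).
    Ok(())
}

#[cfg(not(any(unix, windows)))]
fn daemonize(_options: &DaemonStartOptions) -> Result<(), String> {
    Err("daemon mode is not supported on this platform".to_string())
}

fn main() {
    let options = DaemonStartOptions {
        config_path: PathBuf::from("spacebot.toml"),
        debug: false,
    };
    // The caller sees one function regardless of platform.
    match daemonize(&options) {
        Ok(()) => println!("daemonized (stub)"),
        Err(error) => eprintln!("{error}"),
    }
}
```

Because both backends share one signature, src/main.rs never branches on the platform itself.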

3) Windows background daemon startup (detached re-exec)

  • Implemented Windows daemon launch as a detached child process spawn (re-exec), instead of Unix-style fork/daemonize.
  • Uses Windows process creation flags for detached/no-window background execution.
  • Parent exits after spawning, preserving CLI behavior.
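
A minimal sketch of a detached re-exec, assuming std's `CommandExt::creation_flags` on Windows. The flag constants are the documented Win32 process-creation values; the surrounding spawn code is illustrative rather than the PR's exact implementation:

```rust
// Win32 process creation flags (values from the Windows SDK documentation).
const DETACHED_PROCESS: u32 = 0x0000_0008; // child gets no inherited console
const CREATE_NEW_PROCESS_GROUP: u32 = 0x0000_0200; // child has its own Ctrl+C group

fn daemon_creation_flags() -> u32 {
    DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP
}

// Hedged sketch: re-exec the current binary as a detached background child.
#[cfg(windows)]
fn spawn_daemon_child() -> std::io::Result<u32> {
    use std::os::windows::process::CommandExt; // provides .creation_flags()
    let exe = std::env::current_exe()?;
    let child = std::process::Command::new(exe)
        .arg("--daemon-child")
        .creation_flags(daemon_creation_flags())
        .spawn()?;
    Ok(child.id()) // parent reports the pid, then exits
}

fn main() {
    println!("creation flags: {:#x}", daemon_creation_flags());
}
```

CREATE_NO_WINDOW (0x0800_0000) is another commonly used flag for console-less children; which combination the PR uses is not visible here.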

4) Windows IPC via named pipes (production path)

  • Implemented daemon control IPC with Tokio Windows named pipes.
  • Preserved the same JSON line protocol used by the Unix socket implementation.
  • Supports the same commands:
    • Shutdown
    • Status
  • start/stop/status CLI flow remains unchanged from user perspective.
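
The line-delimited JSON framing can be sketched without the project's serde types. The command names match the list above; the exact JSON field names here are assumptions:

```rust
// Hedged sketch of a newline-delimited JSON control protocol: one JSON object
// per line, with the newline acting as the message delimiter on either a Unix
// socket or a Windows named pipe.
fn encode_command(command: &str) -> String {
    format!("{{\"command\":\"{command}\"}}\n")
}

fn decode_command(line: &str) -> Option<&'static str> {
    // A real implementation would use serde_json; string matching is enough
    // to show the framing.
    match line.trim() {
        "{\"command\":\"Status\"}" => Some("Status"),
        "{\"command\":\"Shutdown\"}" => Some("Shutdown"),
        _ => None,
    }
}

fn main() {
    let wire = encode_command("Status");
    assert!(wire.ends_with('\n')); // the delimiter is part of the frame
    println!("frame: {wire:?}");
}
```

Keeping the frame format transport-agnostic is what lets the Unix socket and named-pipe backends share one protocol.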

5) Windows process liveness checks

  • Implemented PID liveness checks using Win32 process handle APIs (OpenProcess + WaitForSingleObject).
  • Added PID file write support for Windows background daemons.
  • Added stale PID cleanup behavior analogous to Unix path.
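
The stale-PID cleanup flow can be sketched as follows. The Windows liveness probe (OpenProcess + WaitForSingleObject) is stubbed out so the sketch stays platform-independent, and the helper names are illustrative:

```rust
// Hedged sketch of stale-PID handling: read the PID file, probe liveness,
// and remove the file when the recorded process is gone.
use std::fs;
use std::path::Path;

fn read_pid_file(path: &Path) -> Option<u32> {
    fs::read_to_string(path).ok()?.trim().parse().ok()
}

/// Stub: the real Windows path opens the process with SYNCHRONIZE access and
/// treats WAIT_TIMEOUT from WaitForSingleObject(handle, 0) as "still alive".
fn is_process_alive(_pid: u32) -> bool {
    false // pretend the recorded process has exited
}

/// Returns the daemon pid if it is running; removes a stale PID file otherwise.
fn is_running(pid_file: &Path) -> Option<u32> {
    let pid = read_pid_file(pid_file)?;
    if !is_process_alive(pid) {
        if let Err(error) = fs::remove_file(pid_file) {
            eprintln!("failed to remove stale pid file: {error}");
        }
        return None;
    }
    Some(pid)
}

fn main() {
    let pid_file = std::env::temp_dir().join("spacebot-demo.pid");
    fs::write(&pid_file, "12345\n").expect("write pid file");
    assert_eq!(is_running(&pid_file), None); // stale entry gets cleaned up
    println!("stale pid file removed: {}", !pid_file.exists());
}
```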

6) src/main.rs daemon child path

  • Added hidden internal CLI flag --daemon-child for detached Windows child startup.
  • Parent handles onboarding/config resolution; daemon child skips interactive onboarding.
  • Kept the external CLI interface stable (spacebot, spacebot stop, etc.).
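
The onboarding gate can be sketched as below. The real code uses clap with a hidden flag; a plain argument scan is shown to keep the sketch dependency-free, and `should_run_onboarding` is a hypothetical helper:

```rust
// Hedged sketch: a hidden --daemon-child flag gating interactive onboarding.
fn is_daemon_child(args: &[String]) -> bool {
    args.iter().any(|arg| arg == "--daemon-child")
}

fn should_run_onboarding(args: &[String], config_exists: bool) -> bool {
    // A detached background child must never block on interactive prompts;
    // the parent resolves config/onboarding before spawning it.
    !is_daemon_child(args) && !config_exists
}

fn main() {
    let args: Vec<String> = std::env::args().collect();
    println!("daemon child: {}", is_daemon_child(&args));
}
```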

7) Windows filesystem storage stats parity

  • Replaced temporary Windows fallback in src/api/system.rs with real disk capacity reporting via GetDiskFreeSpaceExW.
  • storage_status now reports actual total/available bytes on Windows instead of zeros.
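
A sketch of the Windows capacity query: the extern declaration follows the documented GetDiskFreeSpaceExW signature, and the non-Windows fallback exists only so the sketch compiles everywhere (the real Unix path uses statvfs):

```rust
// Hedged sketch of querying disk capacity via GetDiskFreeSpaceExW.
#[cfg(windows)]
fn disk_capacity(path: &std::path::Path) -> std::io::Result<(u64, u64)> {
    use std::os::windows::ffi::OsStrExt;

    #[link(name = "kernel32")]
    extern "system" {
        fn GetDiskFreeSpaceExW(
            directory: *const u16,
            free_bytes_available: *mut u64,
            total_bytes: *mut u64,
            total_free_bytes: *mut u64,
        ) -> i32;
    }

    // Win32 expects a NUL-terminated UTF-16 path.
    let mut wide: Vec<u16> = path.as_os_str().encode_wide().collect();
    wide.push(0);

    let (mut available, mut total, mut free) = (0u64, 0u64, 0u64);
    let ok = unsafe {
        GetDiskFreeSpaceExW(wide.as_ptr(), &mut available, &mut total, &mut free)
    };
    if ok == 0 {
        return Err(std::io::Error::last_os_error());
    }
    Ok((total, available)) // total and available bytes for the volume
}

#[cfg(not(windows))]
fn disk_capacity(_path: &std::path::Path) -> std::io::Result<(u64, u64)> {
    // Placeholder so the sketch compiles off-Windows; real code uses statvfs.
    Ok((0, 0))
}

fn main() {
    let (total, available) = disk_capacity(std::path::Path::new(".")).expect("query");
    println!("total: {total} bytes, available: {available} bytes");
}
```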

8) Docs

  • Added README note clarifying daemon transport is platform-specific:
    • Unix: Unix domain socket
    • Windows: named pipe

Architecture / Design Notes

  • The existing architecture was a good fit for this change because daemon control is already isolated in src/daemon.rs.
  • No changes were needed to the channel/branch/worker/core agent model.
  • The IPC protocol remains consistent across Unix and Windows.
  • This keeps the control plane cross-platform without leaking platform details into runtime logic.

Validation

  • cargo check --release (Windows): ✅ passed

Known Limitations / Follow-ups

  • Validation in this PR is compile-time (cargo check --release) only.
  • I did not run an end-to-end manual runtime smoke test on Windows in this session for:
    • spacebot (background)
    • spacebot status
    • spacebot stop
  • If desired, a follow-up can add an automated Windows integration test around daemon start/status/stop behavior.

Copilot AI review requested due to automatic review settings February 24, 2026 09:49

coderabbitai bot commented Feb 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds cross-platform daemon control (Unix domain sockets on Unix, named pipes on Windows), platform-specific filesystem usage reporting, expanded daemon lifecycle and IPC types/functions, and a CLI daemon_child flag to adjust onboarding/daemonization for background child processes.

Changes

README.md (Documentation)
  Inserted cross-platform daemon-control notes describing the IPC transport differences (Unix domain socket vs. Windows named pipe) and noting that the same start/stop/status CLI flow applies.

src/api/system.rs (Filesystem Usage)
  Added cfg-gated filesystem usage retrieval: Unix uses statvfs for total/available bytes; Windows uses GetDiskFreeSpaceExW. Both compute used bytes via the existing directory traversal and include platform-specific error paths and fallbacks.

src/daemon.rs (Daemon IPC & Lifecycle)
  Implemented a cross-platform IPC server/client (Unix sockets and Windows named pipes), JSON IPC types (IpcCommand, IpcResponse), DaemonStartOptions, per-platform start_ipc_server/send_command/daemonize implementations, PID file and stale socket/pipe cleanup, process-liveness checks, and wait_for_exit. Added cfg fallbacks for unsupported targets.

src/main.rs (CLI & Startup Flow)
  Added daemon_child: bool to the Start subcommand and propagated it through cmd_start; adjusted onboarding and daemonization logic so background child processes skip onboarding and daemonization is applied conditionally.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI / User
    participant Daemon as Daemon Process
    participant IPC as IPC Transport
    participant OS as OS / Kernel

    rect rgba(100,150,200,0.5)
    Note over CLI,Daemon: Daemon startup and listener setup
    CLI->>Daemon: start (may daemonize)
    Daemon->>Daemon: write PID file
    Daemon->>IPC: create listener (socket or pipe)
    IPC->>OS: bind/register transport
    end

    rect rgba(150,200,100,0.5)
    Note over CLI,Daemon: IPC command flow
    CLI->>IPC: connect & send JSON IpcCommand
    IPC->>Daemon: deliver command
    Daemon->>Daemon: handle command (Status / Shutdown)
    Daemon->>IPC: send JSON IpcResponse
    IPC->>CLI: return response
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I dug a socket, then found a pipe,
I hopped through code on day and night,
PID tucked under clover, listeners awake,
JSON whispers travel fast and light,
Hooray — the daemon hums just right.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check: ✅ Passed. The title accurately describes the primary change: adding Windows daemon mode with named-pipe IPC, which aligns with the main PR objectives and the substantial code changes in src/daemon.rs and src/main.rs.
  • Description check: ✅ Passed. The description comprehensively covers the changeset: it explains the Windows daemon support implementation, the IPC mechanism, dependency gating, and the filesystem stats updates, with clear structure and direct relevance to all modified files.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 88.46%, which is sufficient; the required threshold is 80.00%.




    let mut first_server_options = ServerOptions::new();
    first_server_options.first_pipe_instance(true);
    let mut server = first_server_options
        .create(&pipe_name)

For Windows IPC: it might be worth explicitly restricting the named pipe to the current user (and/or rejecting remote clients, depending on what ServerOptions exposes). As written, any local process that can guess the pipe name can connect and send Shutdown.

Snowy7 (author) replied:

Valid hardening concern.

I’ve kept the current PR scoped to cross-platform daemon functionality + reliability fixes, and I don’t want to ship a partial/incorrect Windows ACL implementation in this patch. Tokio’s named pipe defaults already reject remote clients, but local-user/process access control still needs a proper security descriptor/ACL setup.

Copilot AI left a comment

Pull request overview

This pull request adds Windows daemon mode support to Spacebot, enabling background operation on Windows systems. The implementation uses named pipes for IPC instead of Unix domain sockets, and implements process detachment via re-execution with platform-specific flags. The changes maintain the existing daemon API boundary while adding platform-specific implementations.

Changes:

  • Added Windows-specific daemonization via detached process re-execution with --daemon-child flag
  • Implemented named pipe IPC for Windows daemon control (start/stop/status)
  • Added Windows filesystem capacity reporting using GetDiskFreeSpaceExW API
  • Moved daemonize crate to Unix-only dependency and added windows-sys for Windows

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/main.rs Added daemon_child flag to prevent re-running onboarding and daemonization in child processes
src/daemon.rs Added platform-specific implementations for daemonization, IPC server using named pipes (Windows) or Unix sockets (Unix), and process liveness checks
src/api/system.rs Added Windows implementation of filesystem usage reporting using GetDiskFreeSpaceExW API
Cargo.toml Made daemonize Unix-only dependency and added windows-sys for Windows with required features
Cargo.lock Updated lockfile with windows-sys dependency
README.md Documented that Windows uses named pipes for daemon control


    }
    Err(error) => {
        return Err(error).with_context(|| {
            format!("failed to connect to spacebot daemon pipe {pipe_name}")

Copilot AI Feb 24, 2026


For consistency with the Unix version, consider changing the error message to match: "failed to connect to spacebot daemon. is it running?" This provides a more user-friendly message that doesn't expose internal implementation details like the pipe name.

Suggested change:
    - format!("failed to connect to spacebot daemon pipe {pipe_name}")
    + "failed to connect to spacebot daemon. is it running?"

Snowy7 (author) replied:

Including the pipe name helps diagnose local IPC issues during bring-up/debugging, especially on Windows, where pipe naming is less obvious than with Unix sockets.

Comment on lines +190 to +228
    #[cfg(not(unix))]
    {
        #[cfg(windows)]
        {
            let mut available_bytes = 0u64;
            let mut total_bytes = 0u64;
            let mut free_bytes = 0u64;
            let mut path_wide: Vec<u16> = path.as_os_str().encode_wide().collect();
            path_wide.push(0);

            let result = unsafe {
                GetDiskFreeSpaceExW(
                    path_wide.as_ptr(),
                    &mut available_bytes,
                    &mut total_bytes,
                    &mut free_bytes,
                )
            };
            if result == 0 {
                return Err(anyhow::anyhow!("GetDiskFreeSpaceExW call failed"));
            }

            let total_bytes = (block_size * total_blocks) as u64;
            let used_bytes = directory_size_bytes(path)?;
            let available_bytes = (block_size * avail_blocks) as u64;
            let used_bytes = directory_size_bytes(path)?;
            return Ok(StorageStatus {
                used_bytes,
                total_bytes,
                available_bytes,
            });
        }

        Ok(StorageStatus {
            used_bytes,
            total_bytes,
            available_bytes,
        })

        #[cfg(not(windows))]
        let used_bytes = directory_size_bytes(path)?;
        #[cfg(not(windows))]
        Ok(StorageStatus {
            used_bytes,
            total_bytes: 0,
            available_bytes: 0,
        })
    }

Copilot AI Feb 24, 2026


The nested conditional compilation structure (#[cfg(not(unix))] containing #[cfg(windows)] and #[cfg(not(windows))]) is complex and could be simplified. Consider using mutually exclusive top-level conditions like #[cfg(unix)], #[cfg(windows)], and #[cfg(not(any(unix, windows)))] for better readability and maintainability.

Snowy7 (author) replied:

This is a readability improvement only; behavior is correct and covered by cargo check --release after the platform changes. It can be done in a follow-up.
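
For reference, the mutually exclusive layout the reviewer suggests could look like this toy sketch (function name and return values illustrative):

```rust
// Hedged sketch: flat, mutually exclusive cfg branches instead of nesting
// #[cfg(windows)] inside #[cfg(not(unix))].
fn storage_backend() -> &'static str {
    #[cfg(unix)]
    return "statvfs";
    #[cfg(windows)]
    return "GetDiskFreeSpaceExW";
    #[cfg(not(any(unix, windows)))]
    return "unsupported";
}

fn main() {
    println!("storage backend: {}", storage_backend());
}
```

Exactly one branch survives compilation on any target, which makes the fallback path easier to audit than the nested form.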

coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/daemon.rs (1)

97-134: ⚠️ Potential issue | 🟡 Minor

Windows is_running() lacks pipe verification and can misidentify stale PID entries.

The Windows implementation only checks PID liveness. A reused PID or IPC server startup failure would still return Some(pid), blocking new daemon starts and making stop/status commands misleading. The Unix path verifies socket connectivity; Windows should similarly probe the named pipe. Consider adding the same pattern used in send_command(): attempt to open the pipe and treat ERROR_PIPE_BUSY as running.

🔧 Suggested adjustment (Windows pipe probe)
 #[cfg(not(unix))]
 pub fn is_running(paths: &DaemonPaths) -> Option<u32> {
     let pid = read_pid_file(&paths.pid_file)?;
 
     if !is_process_alive(pid) {
         cleanup_stale_files(paths);
         return None;
     }
 
+    #[cfg(windows)]
+    {
+        let pipe_name = paths.pipe_name();
+        match ClientOptions::new().open(&pipe_name) {
+            Ok(stream) => {
+                drop(stream);
+                return Some(pid);
+            }
+            Err(error) if error.raw_os_error() == Some(ERROR_PIPE_BUSY as i32) => {
+                return Some(pid);
+            }
+            Err(_) => {
+                cleanup_stale_files(paths);
+                return None;
+            }
+        }
+    }
+
     Some(pid)
 }

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7170ca0 and fa50253.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • Cargo.toml is excluded by !**/*.toml
📒 Files selected for processing (4)
  • README.md
  • src/api/system.rs
  • src/daemon.rs
  • src/main.rs

coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/daemon.rs (1)

634-643: Non-(unix|windows) builds can still hit cfg-gated Context usage in write_pid_file.

This remains compiled without cfg, but Context is only imported under cfg(any(unix, windows)). Consider gating write_pid_file or importing Context unconditionally.

#!/bin/bash
# Inspect Context import gating and write_pid_file definition.
rg -nC2 "anyhow::Context|write_pid_file"

Inline comments:
In `@src/daemon.rs`:
- Around line 679-684: The cleanup_stale_files function currently discards
remove_file Results with "let _ =", which violates the no-silent-errors rule;
update cleanup_stale_files to handle errors from std::fs::remove_file on
paths.pid_file (and on paths.socket inside the #[cfg(unix)] block) by either
returning a Result and propagating errors or by logging failures (e.g., using
the crate logger/error macro) with a clear message including the path and error;
replace the "let _ =" calls with proper match/if let Err(e) handlers that log or
propagate the error so filesystem delete failures are not silently ignored.



📥 Commits

Reviewing files that changed from the base of the PR and between fa50253 and 2ac1062.

📒 Files selected for processing (3)
  • src/api/system.rs
  • src/daemon.rs
  • src/main.rs

coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/daemon.rs (1)

468-471: ⚠️ Potential issue | 🟡 Minor

let _ = std::fs::remove_file(...) silently discards errors.

This violates the "no let _ = on Results" guideline. The cleanup_stale_files function (line 704) already demonstrates the correct pattern — log on failure, ignore NotFound.

Suggested fix
     tokio::spawn(async move {
         let _ = cleanup_rx.wait_for(|shutdown| *shutdown).await;
-        let _ = std::fs::remove_file(&cleanup_socket);
+        if let Err(error) = std::fs::remove_file(&cleanup_socket)
+            && error.kind() != std::io::ErrorKind::NotFound
+        {
+            tracing::warn!(%error, path = %cleanup_socket.display(), "failed to remove socket on shutdown");
+        }
     });

As per coding guidelines, "Don't silently discard errors; no let _ = on Results. Handle, log, or propagate errors."

🧹 Nitpick comments (4)
src/daemon.rs (4)

84-91: Move use sha2::Digest as _ before the first usage for readability.

The use item on line 86 is technically visible throughout the block (Rust items are order-independent), but placing it after Sha256::new() — which depends on Digest — is misleading to readers.

Suggested reorder
 #[cfg(windows)]
 fn pipe_name(&self) -> String {
+    use sha2::Digest as _;
     let mut hasher = sha2::Sha256::new();
-    use sha2::Digest as _;
 
     hasher.update(self.pid_file.as_os_str().to_string_lossy().as_bytes());

538-563: Connect phase is bounded, but the command/response exchange has no timeout.

The connect_deadline loop (line 541-560) correctly caps the connect wait at 5 s. However, send_command_over_stream on line 562 can still block indefinitely if the daemon accepts the pipe but never sends a response.

Consider wrapping the whole exchange in tokio::time::timeout:

Sketch
-    send_command_over_stream(stream, command).await
+    tokio::time::timeout(
+        std::time::Duration::from_secs(10),
+        send_command_over_stream(stream, command),
+    )
+    .await
+    .map_err(|_| anyhow!("timed out waiting for daemon IPC response"))?

The same applies to the Unix send_command (line 478-483).


195-222: std::process::exit(0) inside a function returning Result is surprising and skips all caller cleanup.

On Unix, daemonize() returns Ok(()) in the child and the caller drives the rest of the lifecycle. On Windows, the parent hard-exits here, bypassing any Drop guards or cleanup in the call stack. This asymmetry makes the contract of daemonize platform-dependent and fragile.

Consider returning Ok(()) and letting the caller exit, or changing the return type to -> ! / -> anyhow::Result<!> to make the divergence explicit.

One option: return a sentinel and let the caller exit

Return a boolean or enum indicating "I am the parent, you should exit now" versus "I am the child/daemon, continue":

-    eprintln!("spacebot daemon started (pid {})", child.id());
-    std::process::exit(0);
+    eprintln!("spacebot daemon started (pid {})", child.id());
+    // Signal to the caller that the parent should exit cleanly.
+    Err(anyhow!("__parent_exit__"))

Or better, introduce a DaemonizeResult enum with ContinueAsChild / ParentShouldExit { child_pid } variants.


37-37: Use the SYNCHRONIZE constant from windows_sys instead of a magic number.

The windows_sys::Win32::Storage::FileSystem::SYNCHRONIZE constant (value 0x0010_0000) already exists in the crate and should be imported instead of the hardcoded SYNCHRONIZE_ACCESS.

Suggested fix
 #[cfg(windows)]
 use windows_sys::Win32::Foundation::{
     CloseHandle, ERROR_PIPE_BUSY, WAIT_FAILED, WAIT_OBJECT_0, WAIT_TIMEOUT,
 };
 #[cfg(windows)]
+use windows_sys::Win32::Storage::FileSystem::SYNCHRONIZE;
+#[cfg(windows)]
 use windows_sys::Win32::System::Threading::{

Then replace SYNCHRONIZE_ACCESS with SYNCHRONIZE at line 679 and remove the constant definition at line 37.


Inline comments:
In `@src/daemon.rs`:
- Around line 593-596: The code uses BufReader::read_line without a cap which
allows a client to stream an unbounded line and cause OOM; fix by enforcing a
maximum line length (e.g. const MAX_LINE_BYTES) and use
reader.take(MAX_LINE_BYTES) when calling AsyncBufReadExt::read_line (e.g. create
a limited_reader via reader.take(MAX_LINE_BYTES) and call read_line on &mut
limited_reader), return an error if the read hits the limit or if the line is
empty/too large; apply the same change to the response path in
send_command_over_stream so both request and response reads are bounded.
- Around line 210-213: On Windows the child process in the `command` builder
uses Stdio::null() for stdout/stderr so early crashes before
`init_background_tracing` produce no diagnostics; open the same log files used
on Unix (the paths handled by the `_paths` parameter) and replace
`stdin/stdout/stderr(std::process::Stdio::null())` with `Stdio::from()` wrappers
around those opened file handles so the detached child writes to the logs, and
remove the unused `_` prefix on ` _paths` (or otherwise use the parameter) so
Windows follows the same logging behavior as the `daemonize` Unix branch.
- Around line 675-696: The Windows implementation of is_process_alive uses
handle.is_null() but in windows-sys 0.59 HANDLE is an integer type (isize-like),
so replace the null-pointer check with handle == 0; also check the BOOL return
from CloseHandle and log an error if it fails instead of discarding it. Locate
the is_process_alive function and change the OpenProcess null-check to use
handle == 0, call CloseHandle(handle) and inspect its return value (FALSE) to
log failures, keeping the existing WaitForSingleObject handling (WAIT_TIMEOUT =>
true, others => false).

---

Outside diff comments:
In `@src/daemon.rs`:
- Around line 468-471: The spawned task currently drops the Result from
std::fs::remove_file(&cleanup_socket) using "let _ =" which silently discards
errors; change the tokio::spawn block that awaits cleanup_rx.wait_for(|shutdown|
*shutdown) to handle the remove_file Result the same way cleanup_stale_files
does: attempt to remove cleanup_socket, and on Err check for
std::io::ErrorKind::NotFound (ignore) otherwise log the error via your logger
(or eprintln!) with context (e.g., "failed to remove cleanup socket") so
failures are visible instead of being discarded; reference the existing
cleanup_stale_files error-handling pattern and functions (cleanup_rx.wait_for
and cleanup_socket) when making the change.

---

Nitpick comments:
In `@src/daemon.rs`:
- Around line 84-91: In the pipe_name function move the import "use sha2::Digest
as _" to the top of the function before calling sha2::Sha256::new();
specifically, place the use statement before the hasher initialization so Digest
is clearly in scope for functions like finalize() and encode_hex, keeping the
rest of pipe_name (hasher.update, finalize, digest.encode_hex, format!)
unchanged.
- Around line 538-563: The connect loop in send_command (Windows:
send_command(paths: &DaemonPaths, command: IpcCommand)) bounds only the
connection phase but not the subsequent request/response, so wrap the entire
exchange (the ClientOptions::open result followed by the call to
send_command_over_stream(stream, command).await) in a tokio::time::timeout using
a reasonable total timeout (e.g. 5s or configurable) and return a timed-out
error if exceeded; apply the same change to the Unix variant of send_command so
both connection+exchange are time-limited and the function returns an
appropriate anyhow::Error on timeout.
- Around line 195-222: The Windows implementation of daemonize calls
std::process::exit(0) which aborts caller cleanup; instead change daemonize's
contract to return a sentinel so the caller decides whether to exit: define a
DaemonizeResult enum (e.g., ContinueAsChild / ParentShouldExit { child_pid: u32
}), change fn daemonize(...)->anyhow::Result<DaemonizeResult>, and in the
Windows branch return ParentShouldExit with child.id() rather than calling
std::process::exit; leave the Unix path to return ContinueAsChild so the caller
can perform cleanup/exit as appropriate (update callers of
daemonize/DaemonStartOptions to handle the new enum).
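The contract change the comment proposes can be sketched as follows. The enum and handler names mirror the suggestion but are illustrative, not the crate's current API; `handle` stands in for the caller in `main.rs` deciding whether to exit:

```rust
// Sketch of the suggested contract: daemonize reports what happened and the
// caller decides whether to exit, so caller cleanup is never skipped.
#[derive(Debug, PartialEq)]
enum DaemonizeResult {
    /// We are the (grand)child that should keep running as the daemon.
    ContinueAsChild,
    /// We are the parent; the detached child was spawned with this PID.
    ParentShouldExit { child_pid: u32 },
}

// Returns the exit code the caller should use after its own cleanup,
// or None if this process should keep running.
fn handle(result: DaemonizeResult) -> Option<i32> {
    match result {
        DaemonizeResult::ContinueAsChild => None,
        DaemonizeResult::ParentShouldExit { child_pid } => {
            println!("daemon started with pid {child_pid}");
            Some(0)
        }
    }
}

fn main() {
    assert_eq!(handle(DaemonizeResult::ContinueAsChild), None);
    assert_eq!(
        handle(DaemonizeResult::ParentShouldExit { child_pid: 4242 }),
        Some(0)
    );
    println!("ok");
}
```

On Windows the daemonize branch would return `ParentShouldExit { child_pid: child.id() }` after spawning the detached child; on Unix, `ContinueAsChild` after the fork, leaving `std::process::exit` to the caller in both cases.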
- Line 37: Replace the hardcoded SYNCHRONIZE_ACCESS const with the existing
constant from windows_sys: remove the const declaration SYNCHRONIZE_ACCESS and
import/use windows_sys::Win32::Storage::FileSystem::SYNCHRONIZE instead, then
replace all occurrences of SYNCHRONIZE_ACCESS (e.g., the use in the Open* /
access flags) with SYNCHRONIZE so the crate-provided constant is used.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ac1062 and e77c274.

📒 Files selected for processing (1)
  • src/daemon.rs


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/daemon.rs (1)

442-503: ⚠️ Potential issue | 🟡 Minor

Don’t ignore wait_for errors in the cleanup task.
Line 490 drops the Result from wait_for; log or handle it to avoid silent failures.
As per coding guidelines, “Don’t silently discard errors; no let _ = on Results. Handle, log, or propagate errors. Only exception is .ok() on channel sends where the receiver may be dropped”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daemon.rs` around lines 442 - 503, The cleanup task inside
start_ipc_server currently discards the Result from cleanup_rx.wait_for(...)
with let _ = ..., so change it to capture and handle/log the result (e.g., match
cleanup_rx.wait_for(|shutdown| *shutdown).await { Ok(_) => {}, Err(err) =>
tracing::warn!(%err, "cleanup wait_for failed") }), then proceed to remove the
socket using proper error handling: call std::fs::remove_file(&cleanup_socket)
and if Err(e) and e.kind() != std::io::ErrorKind::NotFound log that error (do
not combine remove_file and kind check with && as currently written); reference
start_ipc_server, cleanup_rx.wait_for, and cleanup_socket to locate the code to
update.
♻️ Duplicate comments (1)
src/daemon.rs (1)

516-567: ⚠️ Potential issue | 🟡 Minor

Restrict named-pipe access to the current user.
Default pipe security allows any local process that can guess the name to send Shutdown; consider tightening ACLs or security attributes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daemon.rs` around lines 516 - 567, The IPC named pipe is created with
default security allowing any local process to connect; update start_ipc_server
to set a restrictive security descriptor/ACL so only the current user can
connect. Concretely, when building ServerOptions (the first_server_options and
the subsequent ServerOptions::new() calls used before create(&pipe_name)), add
the platform-specific security attributes (e.g., set a SECURITY_DESCRIPTOR/SDDL
that grants access only to the current user or call the crate API that sets
security attributes) so create(&pipe_name) produces a pipe limited to the
current user; apply the same secure ServerOptions for both the initial
first_server_options and the next_server creation, and test that
handle_ipc_stream still receives connections from the current user only.

ℹ️ Review info


📥 Commits

Reviewing files that changed from the base of the PR and between e77c274 and ad78931.

📒 Files selected for processing (1)
  • src/daemon.rs


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/daemon.rs`:
- Around line 134-157: The is_running function treats all errors from
ClientOptions::open as stale and calls cleanup_stale_files, which causes the PID
to be removed when the named pipe is simply not created yet; update is_running
(and its Windows ClientOptions::open handling) to treat ErrorKind::NotFound (or
equivalent "pipe does not exist") as a transient startup state: either skip the
pipe check and return Some(pid) (or retry the open briefly, e.g., a short loop
with a small deadline similar to send_command) instead of calling
cleanup_stale_files, while preserving the existing behavior for ERROR_PIPE_BUSY
and only cleaning up on genuine non-transient errors.

ℹ️ Review info


📥 Commits

Reviewing files that changed from the base of the PR and between ad78931 and 00ff473.

📒 Files selected for processing (1)
  • src/daemon.rs

Comment on lines +134 to +157
#[cfg(not(unix))]
pub fn is_running(paths: &DaemonPaths) -> Option<u32> {
    let pid = read_pid_file(&paths.pid_file)?;

    if !is_process_alive(pid) {
        cleanup_stale_files(paths);
        return None;
    }

    #[cfg(windows)]
    {
        let pipe_name = paths.pipe_name();
        match ClientOptions::new().open(&pipe_name) {
            Ok(stream) => {
                drop(stream);
                Some(pid)
            }
            Err(error) if error.raw_os_error() == Some(ERROR_PIPE_BUSY as i32) => Some(pid),
            Err(_) => {
                cleanup_stale_files(paths);
                None
            }
        }
    }

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Windows named pipe CreateFile ClientOptions::open error codes FILE_NOT_FOUND vs PIPE_BUSY when pipe does not exist Rust tokio_named_pipes

💡 Result:

On Windows, these two errors mean different things for CreateFile(r"\\.\pipe\NAME", ...):

  • ERROR_FILE_NOT_FOUND (2) / io::ErrorKind::NotFound: there is no named-pipe object with that name right now (i.e., the server hasn’t created any instance yet, or you’re looking in the wrong namespace/name). This is what tokio::net::windows::named_pipe::ClientOptions::open() reports as ErrorKind::NotFound. [2]

  • ERROR_PIPE_BUSY (231): the pipe does exist, but no instance is currently available (commonly: all instances are connected, or the server created instances but isn’t currently waiting/accepting a connection). Microsoft’s guidance is: if CreateFile fails with ERROR_PIPE_BUSY, call WaitNamedPipe (or sleep) and retry. [1] Tokios docs describe the same: ERROR_PIPE_BUSY means “server available, but busy”; retry later. [2]

So if the pipe truly “does not exist” you should get FILE_NOT_FOUND, not PIPE_BUSY. If you’re seeing PIPE_BUSY when you expect “doesn’t exist”, usually one of these is true:

  1. A server process did create that pipe name (maybe a stale/other instance), but it’s saturated/busy. [1]
  2. Your client/server names differ subtly (exact string matters: \\.\pipe\...). [3]
  3. Session/namespace mismatch (e.g., service vs user session), making you think “server isn’t there” when it is (or vice‑versa). [3]

Typical Tokio retry loop (mirrors the docs):

use tokio::net::windows::named_pipe::ClientOptions;
use tokio::time::{sleep, Duration};
use windows_sys::Win32::Foundation::ERROR_PIPE_BUSY;

let name = r"\\.\pipe\my-pipe";
let client = loop {
    match ClientOptions::new().open(name) {
        Ok(c) => break c,
        Err(e) if e.raw_os_error() == Some(ERROR_PIPE_BUSY as i32) => {
            sleep(Duration::from_millis(50)).await;
            continue;
        }
        Err(e) => return Err(e), // includes NotFound (no server)
    }
};

Rule of thumb: treat NotFound as “server not up / wrong name”, and PIPE_BUSY as “server up, but no free instance yet—retry”. [1][2]

Sources: Microsoft “Named Pipe Client” (CreateFile / ERROR_PIPE_BUSY / WaitNamedPipe) [1], Tokio named-pipe client docs for ClientOptions::open error meanings [2], troubleshooting notes on name/timing/session causes [3].


🏁 Script executed:

# Search for daemon startup logic and pipe creation timing
rg "pipe_name|CreateNamedPipe|ClientOptions" src/daemon.rs -B 3 -A 3

# Check if there's any retry or startup sequencing logic
rg "retry|sleep|startup|initialize" src/daemon.rs -i

# Find where the daemon creates the named pipe
fd "*.rs" src | xargs rg "CreateNamedPipe\|NamedPipeServer" -l

Repository: spacedriveapp/spacebot

Length of output: 2872


Treat FILE_NOT_FOUND from named pipe as transient startup state, not stale.

On Windows, ClientOptions::open() returns ErrorKind::NotFound when the pipe doesn't exist yet (the server hasn't created it), distinct from ERROR_PIPE_BUSY which means the pipe exists but is saturated. A race window exists between write_pid_file() and ServerOptions::create() in start_daemon(): if is_running() is called during that gap, the pipe check fails with NotFound, triggering cleanup and deleting the PID file. This allows a second daemon to start.

Currently, only ERROR_PIPE_BUSY is treated as "daemon alive"; all other errors (including NotFound) trigger cleanup_stale_files(), which is too aggressive. Consider either skipping the pipe check if NotFound occurs, or retrying briefly (similar to the 5-second deadline loop in send_command()).

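The NotFound/PIPE_BUSY distinction described above can be captured in a small classifier. This is a portable sketch with illustrative names (`PipeProbe`, `classify`); `ERROR_PIPE_BUSY` is hardcoded to 231 per the quoted Windows documentation so the example compiles everywhere:

```rust
use std::io;

const ERROR_PIPE_BUSY: i32 = 231; // Win32 value, quoted in the docs above

#[derive(Debug, PartialEq)]
enum PipeProbe {
    Alive,      // pipe opened, or exists but is saturated: daemon is up
    StartingUp, // NotFound: server may not have created the pipe yet
    Stale,      // any other error: safe to clean up state files
}

// Classify the result of a named-pipe open the way the review suggests:
// NotFound is a transient startup state, PIPE_BUSY means alive, and only
// the remaining errors justify calling cleanup_stale_files.
fn classify(result: Result<(), io::Error>) -> PipeProbe {
    match result {
        Ok(()) => PipeProbe::Alive,
        Err(e) if e.raw_os_error() == Some(ERROR_PIPE_BUSY) => PipeProbe::Alive,
        Err(e) if e.kind() == io::ErrorKind::NotFound => PipeProbe::StartingUp,
        Err(_) => PipeProbe::Stale,
    }
}

fn main() {
    assert_eq!(classify(Ok(())), PipeProbe::Alive);
    assert_eq!(
        classify(Err(io::Error::from_raw_os_error(ERROR_PIPE_BUSY))),
        PipeProbe::Alive
    );
    assert_eq!(
        classify(Err(io::Error::new(io::ErrorKind::NotFound, "no pipe"))),
        PipeProbe::StartingUp
    );
    assert_eq!(
        classify(Err(io::Error::new(io::ErrorKind::PermissionDenied, "acl"))),
        PipeProbe::Stale
    );
    println!("ok");
}
```

With this shape, `is_running` would treat `StartingUp` like `Alive` (or retry briefly), closing the race window between `write_pid_file()` and pipe creation that the comment identifies.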
