fix(virtiofs): raise virtiofsd FD limit to 524288 #20
Merged
65536 was already being pinned by sbt workloads (see #18): observed 64,943/65,536 on the coursier share, with virtiofsd surfacing host EMFILE as guest ENFILE. 524288 matches modern systemd user-session defaults and gives ~8x headroom over the observed peak.

This is a short-term mitigation. virtiofsd with --cache=auto still accumulates backing-file FDs monotonically; the durable fix is to stop virtiofs-mounting hot caches (coursier, ivy2, npm, cargo, ...) and move them to guest-native paths.

Refs #18.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
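A figure like 64,943/65,536 can be read by comparing a process's open descriptors with its soft limit. The following is a minimal sketch, assuming a Linux host with `/proc`; the function name `fd_usage` is illustrative and not part of nixbox.

```shell
#!/usr/bin/env bash
# fd_usage PID: print "<open FDs>/<soft NOFILE limit>" for a process.
# Hypothetical helper for illustration; requires read access to /proc/PID.
fd_usage() {
  local pid=$1 open limit
  # Each entry under /proc/<pid>/fd is one open descriptor.
  open=$(ls "/proc/$pid/fd" | wc -l)
  # The "Max open files" row lists soft limit, hard limit, units.
  limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
  printf '%s/%s\n' "$open" "$limit"
}
```

For example, `fd_usage "$(pgrep -o virtiofsd)"` would report the oldest running virtiofsd process, assuming it runs as the calling user.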
CI (and any locked-down shell with a session hard limit below 524288) hit 'ulimit: cannot modify limit: Operation not permitted'. bash's ulimit -n sets both soft and hard, so target > hard fails with EPERM even when root could raise it.

Add a raise_nofile helper that tries ulimit first, falls back to 'sudo prlimit --pid $$ --nofile=N:N' to raise the kernel hard limit, then sets the soft limit. sudo is already required by nixbox up (for nftables/tap) and is passwordless on GitHub Actions runners.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
This PR increases the per-process open-file descriptor (NOFILE) limit used when launching virtiofsd so virtiofs-backed shares don’t hit EMFILE/ENFILE under FD-heavy workloads (refs #18).
Changes:
- Raise the NOFILE limit before `virtiofsd` spawns in both the boot path (`do_up`) and hot-plug mount path (`do_mount`) to 524288.
- Introduce a shared `raise_nofile` helper in `lib/functions.bash` (with a `sudo prlimit` fallback).
- Update the e2e test assertion to expect `virtiofsd` NOFILE ≥ 524288.
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `lib/functions.bash` | Adds `raise_nofile` helper to set NOFILE (and optionally adjust limits via `sudo prlimit`). |
| `bin/nixbox` | Replaces inline `ulimit -n 65536` with `raise_nofile 524288` before `virtiofsd` launches. |
| `tests/run-e2e-tests.sh` | Updates FD-limit assertion to match the new 524288 target. |
- Don't lower an already-higher hard limit: ulimit -n sets both soft and hard, so target < current_hard would silently downgrade. Read current limits and only raise what's below target; use -Sn for soft-only paths.
- Use $BASHPID instead of $$ in sudo prlimit --pid: $$ yields the top-level shell PID even inside subshells, which would target the wrong process. $BASHPID always reflects the current bash.
- Reword helper comment: prlimit adjusts per-process rlimits, not a kernel-wide limit like fs.nr_open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
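The behavior described in these review fixes can be sketched as follows. This is an illustrative reconstruction from the commit messages, not the code in `lib/functions.bash`; it assumes bash and a util-linux `prlimit` on the PATH.

```shell
#!/usr/bin/env bash
# raise_nofile TARGET: raise the NOFILE soft limit to at least TARGET
# without ever lowering a limit that is already higher.
# Sketch only; the real helper may differ in detail.
raise_nofile() {
  local target=$1 soft hard
  soft=$(ulimit -Sn)
  hard=$(ulimit -Hn)

  # Touch the hard limit only when it is below the target.
  if [[ $hard != unlimited ]] && (( hard < target )); then
    # An unprivileged shell cannot raise its own hard limit, so fall
    # back to prlimit via sudo. $BASHPID names the current bash process
    # even inside a subshell, unlike $$.
    ulimit -Hn "$target" 2>/dev/null ||
      sudo prlimit --pid "$BASHPID" --nofile="$soft:$target" || return 1
  fi

  # Soft-only raise (-Sn) leaves the hard limit as it is.
  if [[ $soft != unlimited ]] && (( soft < target )); then
    ulimit -Sn "$target" || return 1
  fi
}
```

Because `ulimit` affects the calling shell and its children, the helper has to run in the same shell that later spawns `virtiofsd`, for example `raise_nofile 524288` immediately before the launch.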
razvanz added a commit that referenced this pull request on Apr 27, 2026
ADR-015 distinguishes steady-state runtime virtiofs use (forbidden for churning caches) from bounded one-shot bootstrap reads (permitted). scala-sbt adopts the bootstrap pattern:

- RO virtiofs mounts at /mnt/host-cache/coursier and /mnt/host-cache/ivy2 (only if host paths exist)
- Setup script rsyncs into ~/.cache/coursier and ~/.ivy2 on first boot, guarded by a sentinel file
- Subsequent boots: rsync is a no-op; sbt's runtime I/O lives entirely on root.img and never crosses virtiofs
- rsync added to nix.packages

Trade-off: virtiofsd holds backing-file FDs accumulated during the rsync pass for the VM's lifetime (~50-100k for typical coursier caches), well under the 524288 ceiling from PR #20. No further accumulation occurs since nothing reads the mount after warmup.

ADR-015 updated to codify the bootstrap exception with explicit MUST requirements (readonly, side path, sentinel-guarded, no post-warmup access).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
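A sentinel-guarded one-shot copy of this kind might look like the sketch below. The function name `warm_cache` and the sentinel file name `.warmed` are placeholders chosen for illustration; the plugin's actual setup script is not shown in this thread.

```shell
#!/usr/bin/env bash
# warm_cache SRC DST: copy SRC into DST once, then skip on later boots.
# Hypothetical sketch; the sentinel name is an assumption.
warm_cache() {
  local src=$1 dst=$2
  local sentinel="$dst/.warmed"

  [[ -d $src ]] || return 0        # host path not mounted: nothing to do
  [[ -e $sentinel ]] && return 0   # already warmed on an earlier boot

  mkdir -p "$dst" || return 1
  # Trailing slashes copy the contents of SRC rather than SRC itself.
  rsync -a "$src"/ "$dst"/ || return 1
  # Written only after a successful copy, so a failed pass is retried.
  touch "$sentinel"
}
```

In the guest this would be called as, for instance, `warm_cache /mnt/host-cache/coursier "$HOME/.cache/coursier"`. After the first successful pass nothing reads the mount again, which is what keeps the FD count bounded.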
razvanz added a commit that referenced this pull request on Apr 27, 2026
virtiofsd --cache=auto accumulates backing-file FDs monotonically; PR #20 raised the ceiling but didn't change the trajectory. Mounting churning caches (~/.cache/coursier, ~/.ivy2) hit RLIMIT_NOFILE under sbt workloads (#18).

Replace the virtiofs mounts with a host-side plugin command following the `nixbox aws login` / `nixbox claude-code sync-config` pattern. `nixbox scala-sbt warm-cache` streams the caches into the guest via `tar | nixbox run "tar -x"`, sentinel-guarded; auto-invoked from a post-up hook; also user-callable for manual refreshes. Caches live entirely on root.img per workspace.

ADR-015 documents the stance: in-tree plugins avoid virtiofs for churning caches; third-party plugins are not bound, and the FD cost is documented for those who choose otherwise.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
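The streaming step reduces to packing a directory on one side of a pipe and unpacking it on the other. A minimal sketch follows, with the receiving command passed as arguments so the same function works locally or through a remote runner; `stream_dir` is a hypothetical name, and how `nixbox run` handles its argument is taken only from the commit message above.

```shell
#!/usr/bin/env bash
set -o pipefail  # a failing tar on either side fails the whole pipeline

# stream_dir SRC CMD...: archive the contents of SRC to stdout and pipe
# the archive into CMD, which is expected to unpack it from stdin.
stream_dir() {
  local src=$1
  shift
  tar -C "$src" -cf - . | "$@"
}
```

A local round trip would be `stream_dir ./cache tar -C ./copy -xf -`. With the transport named in the commit message it would read `stream_dir "$HOME/.cache/coursier" nixbox run "tar -x"`, leaving the choice of destination directory to the guest-side command.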
razvanz added a commit that referenced this pull request on Apr 28, 2026
…#22)

virtiofsd --cache=auto pinned one O_PATH FD per cached inode (#18); raising RLIMIT_NOFILE (PR #20) didn't fix the trajectory. --inode-file-handles=mandatory swaps FDs for opaque handles, but requires CAP_DAC_READ_SEARCH.

ensure_virtiofsd_cap keeps a setcap'd virtiofsd copy at $XDG_DATA_HOME/nixbox/bin, reinstalled on version drift (one sudo per version). Run-as-root alternatives all regressed ADR-001 or ADR-002; full rationale in docs/decisions/016.

Verified: FDSize stays flat across 100k+ unique-inode reads. Memory replaces FDs as the long-run ceiling (~8 KB/store inode, ~60 KB/workspace inode of RSS).

Closes #18

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
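One way to keep a capability-bearing copy that is reinstalled only when the upstream binary changes is sketched below. This is not the `ensure_virtiofsd_cap` implementation: the function name `ensure_cap_copy` is hypothetical, and a byte comparison stands in for whatever version check the real helper performs.

```shell
#!/usr/bin/env bash
# ensure_cap_copy SRC DST_DIR: keep a copy of SRC in DST_DIR carrying
# CAP_DAC_READ_SEARCH, reinstalling only when SRC has changed.
# Illustrative sketch; the drift check (cmp) is an assumption.
ensure_cap_copy() {
  local src=$1 dst_dir=$2
  local dst="$dst_dir/virtiofsd"

  # setcap stores the capability in an extended attribute, so the copy
  # stays byte-identical to the source until the source is upgraded.
  if [[ -f $dst ]] && cmp -s "$src" "$dst"; then
    return 0
  fi

  mkdir -p "$dst_dir" || return 1
  install -m 0755 "$src" "$dst" || return 1
  # Setting file capabilities needs privilege; drop the copy on failure
  # so the next call retries instead of trusting an uncapped binary.
  sudo setcap cap_dac_read_search+ep "$dst" || { rm -f "$dst"; return 1; }
}
```

A caller would then launch the copy rather than the system binary, for example after `ensure_cap_copy "$(command -v virtiofsd)" "${XDG_DATA_HOME:-$HOME/.local/share}/nixbox/bin"`, passing `--inode-file-handles=mandatory` as described in the commit message.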
Summary
- Raise the `ulimit -n` set before every `virtiofsd` spawn (both `do_up` and the hot-plug path in `do_mount`) from `65536` to `524288`.
- Update `tests/run-e2e-tests.sh` to match.

Why
The earlier fix (#14) set the ceiling to 65k, but issue #18 documents virtiofsd hitting 64,943/65,536 on the coursier share under an `sbt compile` session: host `EMFILE` surfaces as guest `ENFILE` ("Too many open files in system"). 524288 matches typical modern systemd user-session defaults and gives ~8× headroom over the observed peak.

Trade-offs
This is a short-term mitigation, not a fix. virtiofsd with `--cache=auto` accumulates backing-file FDs monotonically across a session, so any static ceiling will eventually be hit under a long enough workload. The durable fix is to stop virtiofs-mounting hot caches (coursier, ivy2, npm, cargo, …) and move them to guest-native paths; that work is tracked separately.

If the caller's hard limit is below 524288, `set -euo pipefail` makes `ulimit` fail loudly rather than silently degrade. Intentional.

Refs #18.

Test plan
- `shellcheck -x -S warning bin/nixbox lib/functions.bash plugins/*/commands/*.sh plugins/*/scripts/*.sh tests/run-e2e-tests.sh`: clean
- `bats tests/unit/`: 40/40 pass
- `bash tests/run-nix-tests.sh`: 32/32 pass
- e2e (`tests/run-e2e-tests.sh`): requires KVM, deferred to CI

🤖 Generated with Claude Code
shellcheck -x -S warning bin/nixbox lib/functions.bash plugins/*/commands/*.sh plugins/*/scripts/*.sh tests/run-e2e-tests.sh— cleanbats tests/unit/— 40/40 passbash tests/run-nix-tests.sh— 32/32 passtests/run-e2e-tests.sh) — requires KVM, deferred to CI🤖 Generated with Claude Code