
Feature/getdents64 #48

Open

RaymondTana wants to merge 7 commits into main from feature/getdents64

Conversation


RaymondTana (Contributor) commented Feb 17, 2026

Implementation of a getdents64 syscall handler.

  • Fully virtualized response for the procfile backend.
  • Full passthrough for the tmp and passthrough backends.
  • Merged view for the cow backend: kernel entries (passthrough) + cow overlay entries (passthrough) minus tombstoned paths (in-memory).

Required adding a tombstones: Tombstones field (an in-memory hash set of deleted paths/directories) to the Supervisor.
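
A rough sketch of the idea; per the review thread below, the concrete type is a StringHashMapUnmanaged and callers guard access with supervisor.mutex. The method signatures here are assumptions, not the exact code:

```zig
const std = @import("std");

// Sketch only: an in-memory set of deleted, normalized paths.
// Locking is done by callers (under supervisor.mutex), not inside this type.
pub const Tombstones = struct {
    allocator: std.mem.Allocator,
    map: std.StringHashMapUnmanaged(void) = .{},

    pub fn add(self: *Tombstones, path: []const u8) !void {
        if (self.map.contains(path)) return; // already tombstoned
        const owned = try self.allocator.dupe(u8, path); // own the key
        try self.map.put(self.allocator, owned, {});
    }

    pub fn remove(self: *Tombstones, path: []const u8) void {
        if (self.map.fetchRemove(path)) |kv| self.allocator.free(kv.key);
    }

    pub fn isTombstoned(self: *const Tombstones, path: []const u8) bool {
        return self.map.contains(path);
    }

    // True when "<dir>/<name>" has been deleted, so getdents64 can drop it from listings.
    pub fn isChildTombstoned(self: *const Tombstones, dir: []const u8, name: []const u8) bool {
        var buf: [std.fs.max_path_bytes]u8 = undefined;
        const full = std.fmt.bufPrint(&buf, "{s}/{s}", .{ dir, name }) catch return false;
        return self.map.contains(full);
    }
};
```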

This also affected the handlers for fstatat64, faccessat, and openat for the CoW and tmp backends.


Summary by cubic

Adds a getdents64 syscall handler that virtualizes directory listings across backends. Adds in-memory tombstones with locking to hide deleted paths, and fixes lseek/dirent offset handling with proper error propagation.

  • New Features

    • getdents64 wired into syscall dispatch.
    • Passthrough/tmp: direct kernel read; errors propagated.
    • COW: merges kernel + overlay entries, deduped; filters tombstones; per-FD dirents_offset; larger limits; correct d_off; logs when truncated.
    • Proc: synthetic listings for /proc, /proc/self, and /proc/<pid>; directories are typed DIR, status is REG; statx reports correct modes.
    • Tombstones in Supervisor (add/remove/isTombstoned/isChildTombstoned), guarded by mutex.
    • dirent helpers for record sizing and serialization.
  • Refactors

    • faccessat/fstatat64/openat honor tombstones; openat clears the tombstone on O_CREAT, otherwise returns ENOENT.
    • File exposes getdents64; handler uses a 4KB buffer, continues for stdio, and locks for COW/proc to avoid races.
    • lseek dispatch fixed; seeking to 0 resets dirents_offset.

Written for commit 96fcbdb. Summary will update on new commits.


greptile-apps bot commented Feb 17, 2026

Greptile Summary

Adds comprehensive getdents64 syscall support with tombstone-aware directory listings. The implementation includes virtualized /proc directory listings, COW backend with kernel+overlay merging and deduplication, and passthrough for tmp/passthrough backends. All previous review thread issues have been addressed: d_off now uses sequential offsets, directory/file distinction fixed for /proc/self, truncation is logged, overlay errors propagate, lseek resets dirents_offset, and tombstone access is properly mutex-protected.

Key additions:

  • New Tombstones tracking system for deleted paths with proper locking
  • COW getdents64 merges kernel and overlay entries, filters tombstones, with increased limits (2048 entries, 32KB names)
  • Proc getdents64 synthesizes directory listings for /proc, /proc/self, and /proc/<pid>
  • File.dirents_offset tracks returned entries per FD, reset on lseek to position 0
  • openat, faccessat, fstatat64 honor tombstones for COW/tmp paths

Minor issue: tmp backend's getdents64 doesn't filter tombstones (unlike other tmp handlers), though this may be intentional for overlay directories.

Confidence Score: 4/5

  • Safe to merge with one minor inconsistency in tmp backend tombstone handling
  • All major issues from previous review threads were addressed. The implementation is thorough with proper mutex locking, comprehensive test coverage, and correct handling of edge cases. One point deducted for tmp backend getdents64 not filtering tombstones while all other tmp handlers do, creating a potential inconsistency where deleted files appear in directory listings but return ENOENT when accessed.
  • Check src/core/virtual/fs/backend/tmp.zig for tombstone filtering consistency

Important Files Changed

  • src/core/virtual/syscall/handlers/getdents64.zig: New handler implementing the getdents64 syscall with proper mutex locking for proc/cow backends and comprehensive test coverage
  • src/core/virtual/Tombstones.zig: New tombstone tracking system with proper memory management and comprehensive test coverage for file/dir deletion tracking
  • src/core/virtual/fs/backend/cow.zig: Added getdents64 implementation merging kernel and overlay entries with tombstone filtering, deduplication, and truncation logging
  • src/core/virtual/fs/backend/procfile.zig: Added getdents64 for /proc directory listings and fixed directory/file distinction for /proc/self and /proc/<pid>
  • src/core/virtual/fs/File.zig: Added getdents64 dispatcher, dirents_offset tracking field, and lseek now correctly resets dirents_offset to 0 when position is 0
  • src/core/virtual/syscall/handlers/openat.zig: Added mutex-protected tombstone checks for cow/tmp backends, clearing tombstone on O_CREAT or returning ENOENT
  • src/core/Supervisor.zig: Added tombstones field with proper init/deinit lifecycle management
  • src/core/virtual/fs/backend/tmp.zig: Added simple passthrough getdents64 without tombstone filtering (inconsistent with other tmp handlers)

Last reviewed commit: 96fcbdb

greptile-apps bot left a comment

14 files reviewed, 4 comments


Comment on lines 13 to 14 of src/core/virtual/fs/dirent.zig
std.mem.writeInt(u64, buf[0..8], ino, .little);
std.mem.writeInt(i64, buf[8..16], @intCast(ino), .little);

d_off is set to inode, not directory offset

The d_off field (bytes 8-16) is set to @intCast(ino), which copies the inode number into the directory offset field. In linux_dirent64, d_off should be the offset position for lseek/telldir to locate the next entry — not the inode number.

While this doesn't cause issues today (no code in the sandbox reads d_off), programs like ls may use telldir/seekdir on directory streams, and a non-monotonic d_off could cause them to skip or repeat entries.

Additionally, if ino exceeds maxInt(i64) (e.g., on XFS or btrfs which use large inode numbers), the @intCast(ino) from u64 to i64 will panic at runtime.

Consider using the cumulative byte offset or entry_idx for d_off instead:

Suggested change:
      std.mem.writeInt(u64, buf[0..8], ino, .little);
-     std.mem.writeInt(i64, buf[8..16], @intCast(ino), .little);
+     std.mem.writeInt(i64, buf[8..16], @intCast(ino + 1), .little);

(Using ino + 1 here still isn't ideal. The callers should pass a proper sequential offset as a separate parameter, rather than reusing the inode.)


RaymondTana (Contributor, Author) replied:

Made it so writeDirent takes a d_off argument which is determined by adding one to the entry_idx used by the caller in each backend.
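
A sketch of what that can look like for the fixed 19-byte linux_dirent64 header (d_ino, d_off, d_reclen, d_type), consistent with the call shape shown in the cow code later in this thread; the recLen helper and exact parameter names are assumptions:

```zig
const std = @import("std");

// linux_dirent64 header: d_ino (8) + d_off (8) + d_reclen (2) + d_type (1) = 19 bytes,
// followed by the NUL-terminated name, padded to 8-byte alignment.
const HEADER_LEN = 19;

fn recLen(name_len: usize) usize {
    return std.mem.alignForward(usize, HEADER_LEN + name_len + 1, 8);
}

// d_off is now an explicit argument (e.g. entry_idx + 1 from the caller),
// rather than being derived from the inode.
fn writeDirent(buf: []u8, ino: u64, d_off: i64, reclen: u16, d_type: u8, name: []const u8) void {
    std.mem.writeInt(u64, buf[0..8], ino, .little); // d_ino
    std.mem.writeInt(i64, buf[8..16], d_off, .little); // d_off: monotonic position, not the inode
    std.mem.writeInt(u16, buf[16..18], reclen, .little); // d_reclen
    buf[18] = d_type; // d_type
    @memcpy(buf[HEADER_LEN..][0..name.len], name); // d_name
    buf[HEADER_LEN + name.len] = 0; // NUL terminator
}
```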

Comment on lines 17 to 19 of src/core/virtual/fs/backend/procfile.zig
dir_proc, // new
dir_pid_self, // new
dir_pid: NsTgid, // new

dir_pid_self and dir_pid are unreachable dead code

The ProcTarget variants dir_pid_self and dir_pid are defined here and handled in the open switch (line 71), but parseProcPath never returns either of them:

  • /proc/self returns .self_nstgid (line 39), not .dir_pid_self
  • /proc/<N> returns .{ .nstgid = nstgid } (line 57), not .{ .dir_pid = nstgid }

These variants are unreachable dead code. If the intent is to distinguish "opening /proc/self as a directory for getdents64" from "reading /proc/self as a file", then parseProcPath needs to be updated to return these variants for the appropriate paths.


RaymondTana (Contributor, Author) replied:

Made use of these for the directory/file distinction mentioned in another comment.

(Linux treats /proc/self as a symlink to the directory /proc/<pid>)
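
A hedged sketch of the assumed routing after this change; the variant names come from the review comments above, while the /proc/self/status handling, the NsTgid width, and the parsing details are assumptions:

```zig
const std = @import("std");

const NsTgid = u32; // assumed width

const ProcTarget = union(enum) {
    dir_proc, // /proc itself
    dir_pid_self, // /proc/self opened as a directory (kernel-style symlink to /proc/<pid>)
    dir_pid: NsTgid, // /proc/<pid> opened as a directory
    self_nstgid, // file content for the calling task (e.g. /proc/self/status), an assumption
    nstgid: NsTgid, // file content for a specific pid (e.g. /proc/<pid>/status), an assumption
};

fn parseProcPath(path: []const u8) ?ProcTarget {
    if (std.mem.eql(u8, path, "/proc")) return .dir_proc;
    if (std.mem.eql(u8, path, "/proc/self")) return .dir_pid_self;
    if (std.mem.eql(u8, path, "/proc/self/status")) return .self_nstgid;
    if (std.mem.startsWith(u8, path, "/proc/")) {
        var it = std.mem.splitScalar(u8, path["/proc/".len..], '/');
        const pid_str = it.next() orelse return null;
        const pid = std.fmt.parseInt(NsTgid, pid_str, 10) catch return null;
        if (it.next()) |sub| {
            // Only one known subfile is virtualized so far.
            if (it.next() == null and std.mem.eql(u8, sub, "status")) return .{ .nstgid = pid };
            return null;
        }
        return .{ .dir_pid = pid };
    }
    return null;
}
```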

Comment on lines +146 to +151 of src/core/virtual/fs/backend/procfile.zig
} else {
    // /proc/<pid> or /proc/self — list known subfiles
    // TODO: expand as more /proc/<pid>/* files are virtualized
    names_buf[name_count] = "status";
    name_count += 1;
}

getdents64 on /proc/self lists "status" but open path routing disagrees

When a file is opened as /proc/self via openat, parseProcPath returns .self_nstgid (not .dir_pid_self), and ProcFile.open fills in the NsTgid content — treating it as a regular file read. However, getdents64 on this same file would take the else branch here (line 146) and return a directory listing with "status".

This means a guest can open("/proc/self"), get back file content containing the PID number, and then call getdents64 on the same fd to get a directory listing of [".", "..", "status"]. That's inconsistent — the same fd behaves as both a regular file and a directory depending on which syscall is used.


RaymondTana (Contributor, Author) replied:

Fixed this by following the kernel's behavior: treat /proc/self as a directory, and /proc/self/status as a file. Affected many tests throughout the codebase

Comment on lines +291 to +301 of src/core/virtual/fs/backend/cow.zig
const name = name_bytes[0..null_pos];

if (name_pos.* + name.len <= name_storage.len) {
    @memcpy(name_storage[name_pos.*..][0..name.len], name);
    entries[count.*] = .{
        .name = name_storage[name_pos.*..][0..name.len],
        .d_type = d_type,
    };
    count.* += 1;
    name_pos.* += name.len;
}

Silent data loss when name storage overflows

If the combined length of all directory entry names exceeds 4096 bytes (the name_storage buffer), entries are silently dropped — the if condition at line 293 is not met and the entry is skipped without any logging or error indication.

With MAX_DIR_ENTRIES = 256, even short names (average ~16 chars) would fill the 4096-byte buffer. Consider logging when this limit is hit, or increasing the buffer size to match the theoretical capacity.


RaymondTana (Contributor, Author) replied:

Added logging in case any data was truncated. This happens whenever a collectDirents/collectDirentsDedup call processes data (rc > 0) but can't add any more entries, which means we've hit either the MAX_DIR_ENTRIES or MAX_NAME_STORAGE limit

greptile-apps bot left a comment

14 files reviewed, 2 comments


Comment on lines 88 to 97 of src/core/virtual/syscall/handlers/openat.zig
// Check tombstones for COW/tmp paths
if (h.backend == .cow or h.backend == .tmp) {
    if (supervisor.tombstones.isTombstoned(h.normalized)) {
        if (flags.CREAT) {
            supervisor.tombstones.remove(h.normalized);
        } else {
            return LinuxErr.NOENT;
        }
    }
}

Tombstone access without mutex is a data race

supervisor.tombstones is a StringHashMapUnmanaged accessed here without holding supervisor.mutex. Line 90 reads the map (isTombstoned) and line 92 mutates it (remove). Since the Supervisor runs up to MAX_INFLIGHT = 8 concurrent handlers (see Supervisor.zig:78), a concurrent faccessat or fstatat64 handler reading tombstones while this handler calls remove is undefined behavior on the hash map.

The same unprotected read pattern exists in faccessat.zig:78 and fstatat64.zig:117.

This block should be inside a mutex-protected section, or tombstones should use a concurrent-safe data structure.


RaymondTana (Contributor, Author) replied:

Wrapping in supervisor.mutex whenever accessing tombstones.
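
Concretely, the block quoted above becomes something like the following (a sketch; it assumes supervisor.mutex is the same lock that guards the rest of the Supervisor state):

```zig
// Check tombstones for COW/tmp paths, now under the supervisor mutex.
if (h.backend == .cow or h.backend == .tmp) {
    supervisor.mutex.lock();
    defer supervisor.mutex.unlock();
    if (supervisor.tombstones.isTombstoned(h.normalized)) {
        if (flags.CREAT) {
            // Re-creating the file clears its tombstone.
            supervisor.tombstones.remove(h.normalized);
        } else {
            return LinuxErr.NOENT;
        }
    }
}
```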

Comment on lines +162 to +163 of src/core/virtual/fs/backend/cow.zig
// Reset kernel FD's offset so we always read the full directory
_ = linux.lseek(fd, 0, linux.SEEK.SET);

lseek reset discards kernel's directory position

Every call to cow.getdents64 resets the kernel FD offset to 0 with lseek and reads the entire directory into an in-memory array. This works for small directories, but with MAX_DIR_ENTRIES = 256 and name_storage of 4096 bytes, there's a risk for directories with many files:

  1. Directories with >256 entries will be silently truncated — the excess entries are never returned to the caller.
  2. The name_storage buffer (4096 bytes) is shared between kernel and overlay entries. With average filenames of ~16 chars, it fills up at ~256 names, but longer filenames will exhaust it sooner, silently dropping entries.

Consider documenting these limits or logging when entries are dropped.


RaymondTana (Contributor, Author) replied on Feb 17, 2026:

increasing these limits

RaymondTana (Contributor, Author) commented:

@greptile review after this latest commit

greptile-apps bot left a comment

25 files reviewed, 3 comments


Comment on lines 194 to 196 of src/core/virtual/fs/backend/cow.zig
while (entry_count < MAX_DIR_ENTRIES and name_pos < MAX_NAME_STORAGE) {
    const rc = linux.getdents64(overlay_fd, &overlay_buf, overlay_buf.len);
    if (linux.errno(rc) != .SUCCESS or rc == 0) break;

Overlay getdents64 errors silently swallowed

The kernel directory read loop (line 173) propagates errors via checkErr, but the overlay loop here silently ignores getdents64 failures by breaking out of the loop on any non-SUCCESS errno. Since openOverlayDir already returned successfully (the overlay directory exists and was opened), a failure in the subsequent getdents64 call would be unexpected and could mask real I/O issues.

Consider using checkErr here as well (matching the kernel read pattern), or at minimum logging the error:

Suggested change:
      while (entry_count < MAX_DIR_ENTRIES and name_pos < MAX_NAME_STORAGE) {
          const rc = linux.getdents64(overlay_fd, &overlay_buf, overlay_buf.len);
-         if (linux.errno(rc) != .SUCCESS or rc == 0) break;
+         try checkErr(rc, "cow.getdents64 overlay read", .{});
+         if (rc == 0) break;

RaymondTana (Contributor, Author) replied:

Added this checkErr

Comment on lines +162 to +163 of src/core/virtual/fs/backend/cow.zig
var entries: [MAX_DIR_ENTRIES]DirEntry = undefined;
var entry_count: usize = 0;

Stack arrays total ~100KB per call

entries is [2048]DirEntry where DirEntry is a struct with a slice (16 bytes) + u8 (padded to 24 bytes), so ~49KB. name_storage is 32KB. Combined with the 4KB kernel_buf and overlay_buf, that's approximately 89KB of stack per getdents64 invocation. With MAX_INFLIGHT = 8 concurrent handlers, this totals ~712KB of stack usage across all simultaneous calls.

This should be fine for typical Linux stack sizes (default 8MB), but it's worth keeping in mind if the limits are ever increased further or if there are other stack-heavy paths in the call chain.



RaymondTana (Contributor, Author) replied:

@erik-dunteman what do you think about this?

Currently set

const MAX_DIR_ENTRIES = 2048; // max number of entries we can report in a directory
const MAX_NAME_STORAGE = 32768; // total number of bytes we allocate to storing entry names on the stack

These numbers were chosen to balance around an approximate average of 16 bytes per file name. Exceeding either limit truncates the output of getdents64.

greptile-apps bot left a comment

25 files reviewed, 2 comments


In src/core/virtual/fs/File.zig (around line 33):

ref_count: AtomicUsize = undefined,
opened_path: ?[]u8 = null,
open_flags: linux.O = .{},
dirents_offset: usize = 0,

dirents_offset not reset by lseek/rewinddir

The dirents_offset field tracks how many entries have been returned to the guest, but it is never reset when the guest calls lseek(fd, 0, SEEK_SET) on a directory FD. The lseek handler (lseek.zig:41) dispatches to cow.lseek which resets the kernel FD position, but dirents_offset remains at its old value.

Since cow.getdents64 always resets the kernel FD to position 0 and re-reads the full directory, a guest that does:

  1. getdents64(fd) → entries returned, dirents_offset advances
  2. lseek(fd, 0, SEEK_SET) → kernel position reset, dirents_offset unchanged
  3. getdents64(fd) → all entries skipped because dirents_offset is still advanced → returns 0 (spurious EOF)

This breaks rewinddir() / seekdir() for COW and proc directories. glibc's rewinddir calls lseek(fd, 0, SEEK_SET) and expects subsequent getdents64 to restart from the beginning.

Consider resetting dirents_offset to 0 in File.lseek when whence is SEEK_SET and offset is 0 (or unconditionally for directory-backed files).


RaymondTana (Contributor, Author) replied:

Fixed lseek dispatch in File.zig to reset dirents_offset to zero whenever the resulting file's position is zero. Only affects directory FDs.
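
A minimal sketch of that shape; the stand-in File struct and the backendLseek helper are assumptions, only showing where the reset lives:

```zig
// Stripped-down stand-in for the real File struct.
const File = struct {
    dirents_offset: usize = 0, // how many dirents this FD has already returned

    // backendLseek stands in for the real per-backend dispatch (cow/tmp/proc/passthrough).
    fn backendLseek(self: *File, offset: i64, whence: u32) !usize {
        _ = self;
        _ = whence;
        return @intCast(@max(offset, 0)); // placeholder: pretend the backend honored the seek
    }

    pub fn lseek(self: *File, offset: i64, whence: u32) !usize {
        const new_pos = try self.backendLseek(offset, whence);
        // rewinddir() is lseek(fd, 0, SEEK_SET): when the resulting position is 0,
        // restart directory enumeration so the next getdents64 begins at the first entry.
        // Harmless for non-directories, which never consult dirents_offset.
        if (new_pos == 0) self.dirents_offset = 0;
        return new_pos;
    }
};
```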

Comment on lines 145 to 253 of src/core/virtual/fs/backend/cow.zig
) !usize {
    const fd = switch (self.*) {
        inline else => |backing_fd| backing_fd,
    };

    const path = dir_path orelse {
        const rc = linux.getdents64(fd, buf.ptr, buf.len);
        try checkErr(rc, "cow.getdents64", .{});
        return rc;
    };

    const logger = Logger.init(.supervisor);

    var name_storage: [MAX_NAME_STORAGE]u8 = undefined;
    var name_pos: usize = 0;
    var truncated = false;

    var entries: [MAX_DIR_ENTRIES]DirEntry = undefined;
    var entry_count: usize = 0;

    // Reset kernel FD's offset so we always read the full directory
    _ = linux.lseek(fd, 0, linux.SEEK.SET);

    // Read all kernel directory entries
    {
        var kernel_buf: [4096]u8 = undefined;
        while (entry_count < MAX_DIR_ENTRIES and name_pos < MAX_NAME_STORAGE) {
            const rc = linux.getdents64(fd, &kernel_buf, kernel_buf.len);
            try checkErr(rc, "cow.getdents64 kernel read", .{});
            if (rc == 0) break;
            const prev_count = entry_count;
            collectDirents(
                kernel_buf[0..rc],
                &entries,
                &entry_count,
                &name_storage,
                &name_pos,
            );
            if (entry_count == prev_count and rc > 0) {
                truncated = true;
                break;
            }
        }
    }

    // Read overlay directory entries (if overlay cow dir exists for this path)
    if (openOverlayDir(overlay, path)) |overlay_fd| {
        defer _ = linux.close(overlay_fd);
        var overlay_buf: [4096]u8 = undefined;
        while (entry_count < MAX_DIR_ENTRIES and name_pos < MAX_NAME_STORAGE) {
            const rc = linux.getdents64(overlay_fd, &overlay_buf, overlay_buf.len);
            if (linux.errno(rc) != .SUCCESS or rc == 0) break;
            const prev_count = entry_count;
            collectDirentsDedup(
                overlay_buf[0..rc],
                &entries,
                &entry_count,
                &name_storage,
                &name_pos,
            );
            if (entry_count == prev_count and rc > 0) {
                truncated = true;
                break;
            }
        }
    }

    if (truncated) {
        logger.log("cow.getdents64: directory listing truncated for {s} (entries={d}, name_bytes={d})", .{ path, entry_count, name_pos });
    }

    // Serialize entries, skipping already-returned and tombstoned ones
    var buf_pos: usize = 0;
    var entry_idx: usize = 0;

    for (entries[0..entry_count]) |entry| {
        if (entry_idx < dirents_offset.*) {
            entry_idx += 1;
            continue;
        }

        // Don't filter . and .., but filter tombstoned children
        if (!std.mem.eql(u8, entry.name, ".") and !std.mem.eql(u8, entry.name, "..") and tombstones.isChildTombstoned(path, entry.name)) {
            entry_idx += 1;
            dirents_offset.* += 1;
            continue;
        }

        const rec_len = dirent.recLen(entry.name.len);
        // Stop listing if exceeded length of buffer
        if (buf_pos + rec_len > buf.len) break;

        dirent.writeDirent(
            buf[buf_pos..],
            entry_idx + 1,
            @intCast(entry_idx + 1),
            @intCast(rec_len),
            entry.d_type,
            entry.name,
        );

        // Shift buffer position, entry index, and offset accordingly
        buf_pos += rec_len;
        entry_idx += 1;
        dirents_offset.* += 1;
    }

    return buf_pos;
}

Mutex held during kernel I/O in COW getdents64

This entire method is called while holding supervisor.mutex (from getdents64.zig:47). It performs multiple blocking kernel syscalls — lseek (line 166), a loop of getdents64 calls on the kernel FD (line 172), potentially another openat + loop of getdents64 for the overlay (lines 191-209). With MAX_INFLIGHT = 8 concurrent handlers, any handler needing the mutex (FD table lookups, tombstone checks, proc operations) will be blocked for the duration of directory reads on potentially large directories.

For now this is acceptable given the early stage, but as more syscalls are added, consider restructuring to release the mutex around the kernel I/O — e.g., read kernel+overlay entries without the mutex, then re-acquire it only for the tombstone filtering pass.



RaymondTana (Contributor, Author) commented:

@greptile check again

greptile-apps bot left a comment

25 files reviewed, 1 comment


Comment on lines +94 to +98 of src/core/virtual/fs/backend/tmp.zig
pub fn getdents64(self: *Tmp, buf: []u8) !usize {
    const rc = linux.getdents64(self.fd, buf.ptr, buf.len);
    try checkErr(rc, "tmp.getdents64", .{});
    return rc;
}

tmp backend doesn't filter tombstones in getdents64, creating inconsistency: openat, faccessat, and fstatat64 all check tombstones for tmp paths (returning ENOENT for deleted files), but getdents64 will list them. Consider whether tmp should filter tombstones like cow does, or if the passthrough behavior is intentional for tmp overlay directories.

