Skip to content

feat: cgroup leaf-move + ResourceLimiter wrapper#214

Merged
erans merged 29 commits intomainfrom
feat/cgroup-leaf-move-cleanup
Apr 10, 2026
Merged

feat: cgroup leaf-move + ResourceLimiter wrapper#214
erans merged 29 commits intomainfrom
feat/cgroup-leaf-move-cleanup

Conversation

@erans
Copy link
Copy Markdown
Collaborator

@erans erans commented Apr 10, 2026

Summary

  • Auto leaf-move on EBUSY: When subtree_control fails with EBUSY (process in own cgroup), create agentsh.leaf child, move process there, retry — standard systemd pattern
  • ResourceLimiter wrapper: Replace noop stub with cgroupResourceLimiter that lazily creates CgroupManager and delegates Apply() calls via ResourceHandle
  • Dead code removal: Delete unused LinuxLimiter, old platform ResourceLimiter, deprecated ApplyCgroupV2
  • Probe idempotency: Auto-discovered paths ending in agentsh.leaf stripped to parent; explicit absolute hints preserved
  • eBPF test cleanup: Save/restore original cgroup, surface cleanup errors

Test plan

  • go test ./internal/limits/ — 15 probe tests including 6 leaf-move scenarios
  • go test ./internal/platform/linux/ — 7 new ResourceLimiter lifecycle tests
  • go test ./... — full suite passes
  • GOOS=windows go build ./... — cross-compilation clean
  • CI build

🤖 Generated with Claude Code

erans and others added 29 commits April 10, 2026 16:12
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Clarify that the one-shot error injection only applies to WriteString,
not OpenFile. Addresses roborev finding about misleading contract.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Refactor tryLeafMove to return (moved, enabled) so the caller can set
LeafMoved=true even when enable retry fails — the process relocation is
a real side-effect that telemetry must capture.

Replace nil return from linux Platform.Resources() with a
noopResourceLimiter struct that satisfies the interface contract
(Available() == false) so integration tests don't NPE.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Tie CanLimit* capability flags to HasCgroups probe result instead of
  hardcoding true, so capabilities match the noopResourceLimiter contract.
  Move checkCgroups into the concurrent detection goroutine pool.

- Return retry error from tryLeafMove so classifyEnableError reports
  the actual retry failure, not the original EBUSY.

- Fix eBPF integration tests to pass cgDir (the real cgroup path) to
  AttachConnectToCgroup/CgroupID/PNACLMonitorConfig and defer cleanup
  of cgDir instead of the temp name-derivation path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Clarify OwnCgroup field: it is the enforcement root (parent for
  child cgroups), not the process's current location. When LeafMoved
  is true, the process resides in OwnCgroup/leaf.

- tryLeafMove now returns the actual failure error (mkdir or
  cgroup.procs write) so the caller surfaces the real cause in the
  reason string instead of generic EBUSY.

- Set CanLimit* = false on Linux to match Resources().Available() ==
  false. HasCgroups still reports system cgroup v2 availability;
  actual enforcement uses CgroupManager at the server level.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Normalize discovered cgroup paths: if the process is in a /leaf
sub-cgroup from a prior probe's leaf-move, strip the suffix so the
probe operates on the parent. Prevents leaf/leaf creation when the
probe runs twice in the same process (e.g. capabilities.CheckAll
then NewCgroupManager).

Add regression test TestProbe_LeafMove_IdempotentSecondProbe that
simulates two probes and verifies leaf/leaf is never created.

Fix eBPF integration tests to clean up cgDir (the real cgroup path)
before creation instead of an unused temp path, making the tests
repeatable after interrupted runs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Restructure leaf normalization: strip trailing /leaf after path
resolution (not just auto-discovered paths) since "leaf" is
exclusively our synthetic name. Document why this is safe.

Track leafResident flag: when normalization detects the process is
already in a leaf cgroup (from a prior probe), propagate LeafMoved=true
through all result paths. This ensures the second probe on the normal
startup flow (NewCgroupManager after CheckAll) accurately reports the
process's leaf residency.

Update LeafMoved doc: "true if the process resides in OwnCgroup/leaf"
rather than "true if this probe performed the move."

Update idempotency test to verify LeafMoved=true on second probe.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Restrict leaf normalization to auto-discovered cgroup paths only
  (ownHint="" or relative BasePath resolution). Explicit absolute
  ownHint paths are never rewritten, preserving the BasePath contract.

- Restore CanLimit* = hasCgroups (system-level capability reporting).
  These describe whether the OS supports resource limiting, not
  whether the platform.ResourceLimiter interface implements it.

- Fix eBPF integration tests to save the original cgroup and restore
  the test process before cleanup, so test cgroups can be removed.

- Add TestProbe_ExplicitLeafHintNotStripped to verify explicit
  absolute hints ending in "leaf" are passed through unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace noopResourceLimiter with cgroupResourceLimiter that reports
Available() = hasCgroups, aligning with the CanLimit* capability flags.
Apply() returns a clear error directing callers to CgroupManager.
SupportedLimits() reports CPU/Memory/ProcessCount when cgroups exist.

Rename leaf sub-cgroup from "leaf" to "agentsh.leaf" to eliminate any
risk of collision with system cgroup names. Update normalization,
tryLeafMove, all tests, and doc comments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Set CanLimitNetworkBW=false — network bandwidth limiting uses tc/qdisc,
not cgroup v2, so it was never accurately reported as available.

Simplify ResourceLimiter back to noopResourceLimiter with
Available()=false. The CanLimit* flags describe OS-level capabilities
(cgroup v2 support); Resources().Available() describes whether the
platform interface implements limiting (it doesn't — CgroupManager does).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace noopResourceLimiter with cgroupResourceLimiter that lazily
creates a CgroupManager and delegates Apply() calls. This resolves
the architectural tension where CanLimit* capabilities were true but
Resources().Available() returned false.

- Available() returns true when cgroup v2 is present
- Apply() lazily initializes CgroupManager, maps ResourceConfig to
  CgroupV2Limits, returns a ResourceHandle wrapping CgroupV2
- AssignProcess() creates the cgroup on first call, writes to
  cgroup.procs on subsequent calls
- Fix CanLimitDiskIO to false (CgroupV2Limits has no disk IO fields)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ed Apply

- Cache cgroupResourceLimiter on Platform struct (singleton pattern
  matching Filesystem/Network/Sandbox accessors)
- Set h.cg = nil after Release() so double-release is idempotent
- Reject unsupported disk IO and network BW limits in Apply()
  instead of silently ignoring them (fail-closed)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a prior probe leaf-moved the process into <own>/agentsh.leaf and a
second probe uses a relative ownHint, the parent of the leaf path IS the
resolved own cgroup. Re-appending the relative hint would produce
.../sessions/sessions. Now use the parent directly when leaf residency
is detected in the relative-hint branch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Only set h.created=true after successful mgr.Apply, so retries
  after a failed first attach actually re-attempt creation
- Read memory.current and pids.current from the cgroup filesystem
  in Stats() instead of returning empty values

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Restore io.stat parsing in Stats() for disk IO telemetry parity
- Restore CanLimitDiskIO=hasCgroups and dynamically check io
  controller in SupportedLimits() via probe result
- Remove hard error on disk IO limits in Apply() — silently ignore
  like the prior implementation (CgroupV2Limits has no IO fields)
- Detect leaf residency for absolute ownHint via CurrentCgroupDir()
  for accurate LeafMoved telemetry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- CanLimitDiskIO=false since io.max is not written (io.stat is still
  read for telemetry in Stats())
- SupportedLimits returns nil when probe mode is ModeUnavailable,
  preventing callers from seeing support for unenforced limits

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… Apply

- Fix relative ownHint resolution after a prior hint-less probe did the
  leaf-move: check if the parent already ends with the relative hint
  before deciding whether to append or skip
- Reject unsupported limits (DiskIO, NetworkBW, CPUAffinity) in Apply()
  instead of silently ignoring — fail-closed, consistent with
  CanLimitDiskIO=false

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Always append relative ownHint after stripping leaf suffix. If the
  result doubles (same hint used by prior probe), the probe gracefully
  falls through to top-level mode. Eliminates fragile basename matching.
- Make SupportedLimits() side-effect free by returning a static list
  based on DetectCgroupV2() instead of running the full probe via
  ensureManager().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add released flag to cgroupResourceHandle. AssignProcess returns an
explicit error after Release() instead of silently succeeding with
no cgroup attachment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Track active handles by name; reject duplicate Apply for the same
  config.Name to prevent cgroup interference between handles
- Only mark handle released after successful Close(); preserve state
  for retry on transient failures
- Remove handle from tracking map on release
- Document cpu.stat sampling limitation (CPUPercent requires time
  interval, same as prior implementation)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add 7 tests for cgroupResourceLimiter: Available, SupportedLimits,
  reject unsupported limits, duplicate handle name, AssignProcess
  after Release, idempotent Release, Stats before AssignProcess
- Surface cgroup restore/remove errors in eBPF integration test
  cleanup with t.Errorf instead of silently ignoring

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@erans erans force-pushed the feat/cgroup-leaf-move-cleanup branch from 6ab1064 to 493be17 Compare April 10, 2026 23:13
@erans erans merged commit 9798e81 into main Apr 10, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant