10 changes: 8 additions & 2 deletions src/benchflow/trial.py
@@ -545,9 +545,15 @@ async def soft_verify(self) -> tuple[dict | None, str | None, str | None]:
from benchflow._sandbox import _build_cleanup_cmd, _read_hardening_config

self._trial_paths.verifier_dir.mkdir(parents=True, exist_ok=True)
- # Clean verifier output dir — chmod 777 so non-root verifier processes can write
+ # Clean verifier output dir — chmod 777 so non-root verifier processes can write.
+ # Also ensure /app exists: VERIFIER_ENV pins PYTEST_ADDOPTS=--rootdir=/app for
+ # test-node-ID anchoring, and pytest aborts with "Directory '/app' not found"
+ # when the task's Dockerfile WORKDIRs elsewhere (e.g. /root). Tasks that DO
+ # populate /app are unaffected — `mkdir -p` is a no-op when the directory
+ # already exists, and we don't chmod it (any task content stays root-owned).
await self._env.exec(
- "rm -rf /logs/verifier && mkdir -p /logs/verifier && chmod 777 /logs/verifier",
+ "rm -rf /logs/verifier && mkdir -p /logs/verifier /app && "
+ "chmod 777 /logs/verifier",
user="root", timeout_sec=10,
Comment on lines 554 to 557
Contributor

🟡 mkdir -p /app fix applied to soft_verify but not to the full verify path's _CLEAR_VERIFIER_DIR_CMD

The PR adds `mkdir -p /app` to `soft_verify()` to prevent pytest from aborting with "Directory '/app' not found" when `VERIFIER_ENV` pins `PYTEST_ADDOPTS=--rootdir=/app` and the task's Dockerfile WORKDIRs elsewhere (e.g. `/root`). However, the same issue exists in the full `verify()` path: `sdk._verify()` calls `harden_before_verify()` at src/benchflow/sdk.py:403, which executes `_CLEAR_VERIFIER_DIR_CMD` at src/benchflow/_sandbox.py:717, and that constant at src/benchflow/_sandbox.py:593-595 does not include `mkdir -p /app`. Since `harden_before_verify()` also sets `PYTEST_ADDOPTS` with `--rootdir=/app` (via `VERIFIER_ENV` at src/benchflow/_sandbox.py:304-308), the same pytest abort will occur during the final `verify()` for any task whose Dockerfile doesn't create `/app`.

Prompt for agents
The mkdir -p /app fix in soft_verify() at src/benchflow/trial.py:555 addresses a real issue where pytest aborts with "Directory '/app' not found" when VERIFIER_ENV sets --rootdir=/app but the task's Dockerfile WORKDIRs elsewhere. However, the same fix needs to be applied to the full verify() path.

The constant _CLEAR_VERIFIER_DIR_CMD at src/benchflow/_sandbox.py:593-595 is used by harden_before_verify() at line 717, which is called from sdk._verify() at src/benchflow/sdk.py:403. This constant should also include 'mkdir -p /app' to ensure /app exists before the verifier runs pytest with --rootdir=/app.

Suggested fix: Update _CLEAR_VERIFIER_DIR_CMD in src/benchflow/_sandbox.py from:
  "rm -rf /logs/verifier && mkdir -p /logs/verifier && chmod 777 /logs/verifier"
to:
  "rm -rf /logs/verifier && mkdir -p /logs/verifier /app && chmod 777 /logs/verifier"
)
# Purge agent-injected conftest/sitecustomize/.pth without