Summary
When running tasks that include a docker-compose.yaml on Daytona, BenchFlow fails with Process closed stdout (rc=None), while Harbor brings up the same compose environment without problems. The agent is never installed and never starts.
Reproduction
Same task (gh-repo-analytics), same model (claude-opus-4-7), same Daytona:
| Method | Result | Tool Calls | Error |
| --- | --- | --- | --- |
| BenchFlow + Docker (host) | ✅ reward=0.0 | 8 | None |
| Harbor + Daytona DinD | ⚠️ reward=0.0 | N/A | NonZeroAgentExitCodeError (auth issue, but compose worked) |
| BenchFlow + Daytona DinD | ❌ | 0 | Process closed stdout (rc=None) |
Harbor proves the DinD compose infrastructure works on Daytona. BenchFlow's ACP path fails at the agent install/exec step.
Root cause analysis
BenchFlow's DaytonaProcess.from_harbor_env() detects DinD mode and constructs docker compose exec commands to run inside the DinD container. However:
- install_agent() calls env.exec(), which goes through Harbor's DaytonaEnvironment._sandbox_exec() → _strategy.exec() → _DaytonaDinD.exec() → _compose_exec(["exec", ...])
- The compose exec command runs inside the DinD VM via SSH
- The agent install produces empty stdout; the install command's output is lost
- After install, connect_acp() creates a DaytonaProcess and tries to start the agent via docker compose exec ... claude-agent-acp
- The agent process immediately closes stdout → Process closed stdout (rc=None)
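The launch step above amounts to assembling a docker compose exec command line and sending it over SSH. The following is a minimal sketch of such a builder, not BenchFlow's or Harbor's actual code: the function names and the EXAMPLE_TOKEN variable are hypothetical. It relies only on documented docker compose exec flags: -T disables pseudo-TTY allocation (relevant when stdout is piped), and -e KEY=VALUE sets an environment variable for the exec'd process.

```python
import shlex
from typing import Mapping, Sequence


def build_compose_exec_argv(
    service: str,
    command: Sequence[str],
    env: Mapping[str, str],
) -> list[str]:
    """Return an argv for `docker compose exec` with explicit env injection.

    Hypothetical helper for illustration only.
    """
    argv = ["docker", "compose", "exec", "-T"]
    # Sort keys so the generated command is deterministic and easy to diff.
    for key in sorted(env):
        argv += ["-e", f"{key}={env[key]}"]
    argv.append(service)
    argv.extend(command)
    return argv


def to_ssh_command(argv: Sequence[str]) -> str:
    """Quote argv into one shell string, as needed when it is sent over SSH."""
    return shlex.join(argv)


if __name__ == "__main__":
    argv = build_compose_exec_argv(
        "main", ["claude-agent-acp"], {"EXAMPLE_TOKEN": "a b"}
    )
    print(to_ssh_command(argv))
```

Logging the quoted string just before it is sent would show whether the env flags survive each hop of the SSH → DinD → compose exec chain.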
Likely causes:
- The compose env vars (_compose_env_vars()) may not include the subscription auth credentials needed by the agent
- The SSH → DinD → compose exec chain may lose the --env-file or env var injection
- Node.js may not be installed inside the compose container (the DinD VM is docker:28.3.3-dind Alpine, but the compose main container is built from the task's Dockerfile)
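One way to tell these hypotheses apart is to run a small probe inside the compose main container through the same exec path the agent uses. The sketch below builds a POSIX sh script that reports whether given binaries are on PATH and whether given environment variables are set, without printing their values, and parses the result. The helper names and EXAMPLE_AUTH_TOKEN are placeholders; the real credential variable names are not stated in this report.

```python
import re
from typing import Iterable

_ENV_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_BIN_NAME = re.compile(r"^[A-Za-z0-9._+-]+$")


def build_probe_script(binaries: Iterable[str], env_names: Iterable[str]) -> str:
    """Build a POSIX sh script that prints `bin:NAME=yes|no` and `env:NAME=yes|no`."""
    lines = []
    for name in binaries:
        if not _BIN_NAME.match(name):
            raise ValueError(f"unsafe binary name: {name!r}")
        lines.append(
            f"if command -v {name} >/dev/null 2>&1; "
            f"then echo bin:{name}=yes; else echo bin:{name}=no; fi"
        )
    for name in env_names:
        if not _ENV_NAME.match(name):
            raise ValueError(f"invalid env var name: {name!r}")
        ref = "${" + name + ":-}"  # expands to empty when the variable is unset
        lines.append(
            f'if [ -n "{ref}" ]; '
            f"then echo env:{name}=yes; else echo env:{name}=no; fi"
        )
    return "\n".join(lines)


def parse_probe_output(output: str) -> dict[str, bool]:
    """Turn probe output lines into a mapping such as {"bin:node": False}."""
    result: dict[str, bool] = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and value in ("yes", "no"):
            result[key] = value == "yes"
    return result


if __name__ == "__main__":
    # EXAMPLE_AUTH_TOKEN is a placeholder, not a real BenchFlow variable.
    print(build_probe_script(["node", "npm"], ["EXAMPLE_AUTH_TOKEN"]))
```

The script could be passed as the argument to sh -c in the main container. A bin:node=no line would point at the missing Node.js hypothesis, while env:...=no lines would point at lost env injection.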
Evidence
BenchFlow DinD result:
- install-stdout.txt: empty
- claude_agent_acp.txt: empty
- timing.json: environment_setup: 42.7s (compose up succeeded)
- config.json: _BENCHFLOW_SUBSCRIPTION_AUTH: 1 (OAuth detected)
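Empty stdout logs together with rc=None leave the actual failure reason open. Assuming (this is a guess about BenchFlow's internals) that rc reflects something like Python's Popen.returncode, a value of None only means the exit code had not been collected when stdout closed. The sketch below uses a hypothetical run_and_capture helper to show how waiting for the process and capturing stderr recovers both pieces of information.

```python
import subprocess
import sys


def run_and_capture(argv: list[str], timeout: float = 30.0) -> dict:
    """Run argv and return its exit code, stdout, and stderr."""
    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    # returncode stays None until poll(), wait(), or communicate() reaps the child.
    rc_before_wait = proc.returncode
    try:
        stdout, stderr = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        stdout, stderr = proc.communicate()
    return {
        "rc_before_wait": rc_before_wait,
        "rc": proc.returncode,
        "stdout": stdout,
        "stderr": stderr,
    }


if __name__ == "__main__":
    # Stand-in for a failing agent: writes only to stderr and exits non-zero.
    failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
    print(run_and_capture(failing))
```

Applied to the install and agent launch commands, the exit code and stderr text would help separate a missing binary from an authentication failure.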
Environment
- BenchFlow v0.3.1 (dev-0.3 branch)
- Harbor (pip installed, used by BenchFlow)
- Daytona cloud
- Tasks affected: all 5 with docker-compose.yaml in SkillsBench (gh-repo-analytics, pedestrian-traffic-counting, pg-essay-to-audiobook, scheduling-email-assistant, react-performance-debugging)
Expected behavior
BenchFlow should install and run claude-agent-acp inside the DinD compose main container, just as Harbor installs and runs the claude CLI there.