Agent Diagnostic
- Loaded and followed the upstream `openshell-cli`, `debug-openshell-cluster`, and `create-github-issue` skill instructions from the OpenShell repo.
- Reproduced the failure from a local checkout using `uv run openshell sandbox create --name deepagent-sandbox-repro --from ./Dockerfile --keep -vv`.
- Verified the gateway is otherwise healthy:
  - `uv run openshell status` => Connected to `https://127.0.0.1:8080`, version `0.0.16`
  - `uv run openshell doctor exec -- kubectl get --raw=/readyz` => `ok`
  - `uv run openshell doctor exec -- kubectl get pods -A -o wide` => OpenShell pod and core cluster pods are running
  - `uv run openshell doctor exec -- kubectl -n openshell get statefulset/openshell -o wide` => `1/1` ready
- The failure is isolated to the BYOC Dockerfile image handoff path, after the image is built and exported but before sandbox creation completes.
- Verbose logs show the client exporting a 2782 MiB image tar, then attempting Docker archive upload to the gateway container at `/containers/openshell-cluster-openshell/archive?path=%2Ftmp`, which fails with:
  - `failed to upload image tar into container`
  - `Error in the hyper legacy client: client error (SendRequest)`
  - `error writing a body to connection`
  - `Invalid argument (os error 22)`
- I also observed non-fatal cluster pressure signals that may be relevant but do not appear to be the primary root cause:
  - gateway container filesystem: `59G` total, `44G` used, `12G` available
  - gateway memory: `7.7G` total, `1.3G` free, swap nearly exhausted
  - recent K8s events included `ImageGCFailed` / `FreeDiskSpaceFailed` on the node image filesystem at 85% usage
- I could not resolve the issue from the CLI because the failure occurs inside the client->gateway Docker archive streaming step.
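
As a quick sanity check on the numbers quoted in the diagnostic, here is a minimal Python sketch. It decodes the upload endpoint from the verbose log, maps `os error 22` to its symbolic errno name, and compares the exported tar size against the free space reported for the gateway filesystem. The helper names (`upload_target`, `describe_os_error`, `tar_share_of_free`) are hypothetical, and reading `12G` as GiB is an assumption; none of this is part of OpenShell.

```python
import errno
import os
from urllib.parse import parse_qs, urlsplit

# Endpoint path copied from the verbose log in this report.
ENDPOINT = "/containers/openshell-cluster-openshell/archive?path=%2Ftmp"


def upload_target(endpoint: str) -> str:
    """Return the percent-decoded `path` query parameter of the archive endpoint."""
    query = urlsplit(endpoint).query
    return parse_qs(query)["path"][0]


def describe_os_error(code: int) -> str:
    """Map a raw OS error number to its symbolic name and platform message."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


def tar_share_of_free(tar_mib: float, free_gib: float) -> float:
    """Fraction of the free space that the image tar alone would occupy.

    Assumption: the `12G` figure from the report is in GiB.
    """
    return (tar_mib / 1024) / free_gib


print(upload_target(ENDPOINT))
print(describe_os_error(22))
print(f"{tar_share_of_free(2782, 12):.0%} of reported free space")
```

On macOS and Linux, errno 22 is `EINVAL`, so the bottom of the error chain is a rejected write on the client side of the connection rather than an out-of-space or connection-refused condition. The tar alone is well under the reported free space, which fits the reading that disk pressure is a secondary signal, although any extra space the import step needs is not accounted for here.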
Description
Actual behavior: `openshell sandbox create --from ./Dockerfile --keep` successfully builds the custom image, exports it, and then fails while uploading the tar into the running gateway container. Sandbox creation aborts before a sandbox is created.
Expected behavior: OpenShell should import the built image into the gateway/cluster successfully and then proceed with sandbox creation.
Reproduction Steps
- Use macOS with Docker Desktop running and OpenShell `0.0.16`.
- Create a Dockerfile based on `ghcr.io/nvidia/openshell-community/sandboxes/base:latest`, for example:
      USER root
      RUN pip install --no-cache-dir --upgrade pip && pip install --no-cache-dir matplotlib pandas seaborn
      USER sandbox
      ENTRYPOINT ["/bin/bash"]
- Run `uv run openshell sandbox create --name deepagent-sandbox-repro --from ./Dockerfile --keep -vv`.
- Wait for the image build/export to finish.
- Observe failure during image upload into the gateway container.
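
To make the reproduction easier to repeat, the steps above can be wrapped in a small Python helper that runs the same command and reports which of the failure lines quoted in this report appear in the output. This is only a sketch: `run_repro` and `matched_markers` are hypothetical names, and scanning both stdout and stderr is an assumption, since the report does not say which stream the CLI writes its errors to.

```python
import subprocess

# Command copied verbatim from the reproduction steps.
REPRO_CMD = [
    "uv", "run", "openshell", "sandbox", "create",
    "--name", "deepagent-sandbox-repro",
    "--from", "./Dockerfile",
    "--keep", "-vv",
]

# Failure lines quoted in the Logs section of this report.
FAILURE_MARKERS = (
    "failed to upload image tar into container",
    "error writing a body to connection",
    "Invalid argument (os error 22)",
)


def matched_markers(log_text: str) -> list[str]:
    """Return the known failure lines that occur in the captured output."""
    return [marker for marker in FAILURE_MARKERS if marker in log_text]


def run_repro(cmd: list[str] = REPRO_CMD) -> tuple[int, list[str]]:
    """Run the reproduction command and scan its combined output.

    Assumption: the error chain may land on stdout or stderr, so both are scanned.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, matched_markers(proc.stdout + proc.stderr)
```

Calling `run_repro()` from the directory that contains the Dockerfile returns the exit code together with the matched lines. If all three markers match, the run hit the same upload failure described here rather than a build or export error.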
Environment
- OS: macOS 14.5 (darwin 23.5.0)
- Docker: 27.5.1 / 27.5.1
- OpenShell: 0.0.16
- Gateway endpoint: https://127.0.0.1:8080
Logs
Building image openshell/sandbox-from:1774865955 from /Users/user/Workspace/openshell-deepagent/Dockerfile
Built image openshell/sandbox-from:1774865955
Pushing image openshell/sandbox-from:1774865955 into gateway "openshell"
[progress] Exported 2782 MiB
DEBUG bollard::docker: unix://.../containers/openshell-cluster-openshell/archive?path=%2Ftmp
Error: × failed to upload image tar into container
├─▶ Error in the hyper legacy client: client error (SendRequest)
├─▶ client error (SendRequest)
├─▶ error writing a body to connection
╰─▶ Invalid argument (os error 22)
Agent-First Checklist
- I pointed my agent at the repo and had it investigate this issue
- I loaded relevant skills (e.g., `debug-openshell-cluster`, `debug-inference`, `openshell-cli`)
- My agent could not resolve this; the diagnostic above explains why