sandbox create --from Dockerfile fails uploading image tar into gateway container on macOS #674

@lucasvw

Agent Diagnostic

  • Loaded and followed the upstream openshell-cli, debug-openshell-cluster, and create-github-issue skill instructions from the OpenShell repo.
  • Reproduced the failure from a local checkout using uv run openshell sandbox create --name deepagent-sandbox-repro --from ./Dockerfile --keep -vv.
  • Verified the gateway is otherwise healthy:
    • uv run openshell status => Connected to https://127.0.0.1:8080, version 0.0.16
    • uv run openshell doctor exec -- kubectl get --raw=/readyz => ok
    • uv run openshell doctor exec -- kubectl get pods -A -o wide => OpenShell pod and core cluster pods are running
    • uv run openshell doctor exec -- kubectl -n openshell get statefulset/openshell -o wide => 1/1 ready
  • The failure is isolated to the BYOC Dockerfile image handoff path, after the image is built and exported but before sandbox creation completes.
  • Verbose logs show the client exporting a 2782 MiB image tar, then attempting Docker archive upload to the gateway container at /containers/openshell-cluster-openshell/archive?path=%2Ftmp, which fails with:
    • failed to upload image tar into container
    • Error in the hyper legacy client: client error (SendRequest)
    • error writing a body to connection
    • Invalid argument (os error 22)
  • I also observed non-fatal cluster pressure signals that may be relevant but do not appear to be the primary root cause:
    • gateway container filesystem: 59G total, 44G used, 12G available
    • gateway memory: 7.7G total, 1.3G free, swap nearly exhausted
    • recent K8s events included ImageGCFailed / FreeDiskSpaceFailed on the node image filesystem at 85% usage
  • I could not resolve the issue from the CLI because the failure occurs inside the client->gateway Docker archive streaming step.
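
As a rough check on the disk-pressure observation above, the following sketch compares the exported tar size with the space reported as available in the gateway container. It assumes the reported figures are GiB and MiB, and the `copies=2` overhead factor is a guess (one copy for the tar in /tmp, one for the imported layers). The real import overhead is not known from the logs.

```python
# Figures copied from the diagnostic above; units are assumed to be GiB / MiB.
GATEWAY_FS_AVAILABLE_GIB = 12
IMAGE_TAR_MIB = 2782


def tar_fits(available_gib: float, tar_mib: float, copies: int = 2) -> bool:
    """Return True if `copies` copies of the tar fit in the available space.

    copies=2 is an assumption: one copy for the uploaded tar in /tmp and one
    for the imported image layers.
    """
    needed_gib = copies * tar_mib / 1024
    return needed_gib <= available_gib


tar_gib = IMAGE_TAR_MIB / 1024
print(f"tar size: {tar_gib:.2f} GiB")
print(f"fits with 2x overhead: {tar_fits(GATEWAY_FS_AVAILABLE_GIB, IMAGE_TAR_MIB)}")
```

Under these assumptions the tar fits with room to spare, which is consistent with treating disk pressure as a secondary signal rather than the cause of the upload failure.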

Description

Actual behavior: openshell sandbox create --from ./Dockerfile --keep successfully builds the custom image, exports it, and then fails while uploading the tar into the running gateway container. Sandbox creation aborts before a sandbox is created.
Expected behavior: OpenShell should import the built image into the gateway/cluster successfully and then proceed with sandbox creation.

Reproduction Steps

  1. Use macOS with Docker Desktop running and OpenShell 0.0.16.
  2. Create a Dockerfile based on ghcr.io/nvidia/openshell-community/sandboxes/base:latest, for example:
    • FROM ghcr.io/nvidia/openshell-community/sandboxes/base:latest
    • USER root
    • RUN pip install --no-cache-dir --upgrade pip && pip install --no-cache-dir matplotlib pandas seaborn
    • USER sandbox
    • ENTRYPOINT ["/bin/bash"]
  3. Run uv run openshell sandbox create --name deepagent-sandbox-repro --from ./Dockerfile --keep -vv.
  4. Wait for the image build/export to finish.
  5. Observe failure during image upload into the gateway container.

Environment

  • OS: macOS 14.5 (darwin 23.5.0)
  • Docker: 27.5.1 / 27.5.1
  • OpenShell: 0.0.16
  • Gateway endpoint: https://127.0.0.1:8080

Logs

Building image openshell/sandbox-from:1774865955 from /Users/user/Workspace/openshell-deepagent/Dockerfile
Built image openshell/sandbox-from:1774865955
Pushing image openshell/sandbox-from:1774865955 into gateway "openshell"
[progress] Exported 2782 MiB
DEBUG bollard::docker: unix://.../containers/openshell-cluster-openshell/archive?path=%2Ftmp
Error:   × failed to upload image tar into container
  ├─▶ Error in the hyper legacy client: client error (SendRequest)
  ├─▶ client error (SendRequest)
  ├─▶ error writing a body to connection
  ╰─▶ Invalid argument (os error 22)
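
To help read the final error line, this small sketch decodes OS error 22 and converts the exported tar size to bytes. The comparison against INT_MAX is only a possible lead, not a confirmed cause: as far as I know, macOS documents EINVAL for write(2) when the requested byte count exceeds INT_MAX, but nothing in these logs shows that the upload path issues a single write of that size.

```python
import errno
import os

IMAGE_TAR_MIB = 2782     # from "[progress] Exported 2782 MiB"
INT_MAX = 2**31 - 1      # largest value of a signed 32-bit int

tar_bytes = IMAGE_TAR_MIB * 1024 * 1024

# Symbolic name and message for "os error 22".
print(errno.errorcode[22])
print(os.strerror(errno.EINVAL))

# Size of the exported tar in bytes, and whether it exceeds INT_MAX.
print(tar_bytes, tar_bytes > INT_MAX)
```

If the size limit does turn out to matter, a smaller image (or a chunked upload) would be a useful comparison point when reproducing.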

Agent-First Checklist

  • I pointed my agent at the repo and had it investigate this issue
  • I loaded relevant skills (e.g., debug-openshell-cluster, debug-inference, openshell-cli)
  • My agent could not resolve this — the diagnostic above explains why
