vGPU memory gets mixed up when pods are scheduled concurrently #60

@singeleaf

Description

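Running the command below schedules two pods at the same time: one requests 24576M of GPU memory and the other requests 600M. Once both pods are up, running nvidia-smi inside each container shows that the two limits are swapped: container ubuntu-container-24576 was given 600M of GPU memory, and container ubuntu-container-600 was given 24576M.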

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-1v-24576-1
spec:
  schedulerName: volcano
  containers:
    - name: ubuntu-container-24576
      image: ubuntu:18.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          volcano.sh/vgpu-number: 1 # requesting 1 vGPU
          volcano.sh/vgpu-memory: 24576
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-1v-600-1
spec:
  schedulerName: volcano
  containers:
    - name: ubuntu-container-600
      image: ubuntu:18.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          volcano.sh/vgpu-number: 1 # requesting 1 vGPU
          volcano.sh/vgpu-memory: 600
EOF
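
To make the swap easier to confirm without opening a shell in each container, the check can be scripted. This is only a minimal sketch: the helper names `reported_mib` and `compare_mib` are hypothetical, and it assumes that nvidia-smi inside the container reports the vGPU limit as its total memory in the same unit as `volcano.sh/vgpu-memory`, which may not hold exactly on every setup.

```shell
#!/usr/bin/env bash
# Sketch: compare the vGPU memory a pod requested with what nvidia-smi
# reports inside its container. Helper names are hypothetical.

# Print the total GPU memory visible inside a pod's container (first GPU only).
reported_mib() {
  local pod="$1" container="$2"
  kubectl exec "$pod" -c "$container" -- \
    nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits \
    | head -n 1 | tr -d '[:space:]'
}

# Compare requested and reported values; returns non-zero on mismatch.
# Assumption: an exact match is expected; relax this if your setup rounds.
compare_mib() {
  local name="$1" requested="$2" reported="$3"
  if [ "$requested" = "$reported" ]; then
    echo "OK $name: requested=${requested} reported=${reported}"
  else
    echo "MISMATCH $name: requested=${requested} reported=${reported}"
    return 1
  fi
}

# Usage (requires access to the cluster where the pods above are running):
#   compare_mib gpu-pod-1v-24576-1 24576 \
#     "$(reported_mib gpu-pod-1v-24576-1 ubuntu-container-24576)"
#   compare_mib gpu-pod-1v-600-1 600 \
#     "$(reported_mib gpu-pod-1v-600-1 ubuntu-container-600)"
```

With the behavior described above, both calls would be expected to print a MISMATCH line, since each container reports the other pod's limit.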
