Feat: Update GPU container to use python container as base instead of cuda container#76

Merged
fujikosu merged 4 commits into main from feat/simple-gpu-container on Mar 31, 2025

Conversation

@fujikosu
Member

Purpose

This PR updates the base container of sample_pytorch_gpu_project from a CUDA container to a PyTorch container.
As discussed in this thread (https://discuss.pytorch.org/t/cannot-for-the-life-of-me-get-pytorch-and-cuda-to-install-work/197088/5?utm_source=chatgpt.com), installing the GPU version of the PyTorch wheel also installs the CUDA dependencies it needs, and a different CUDA version installed in the local environment is not used. Using a CUDA container as the base therefore provides no benefit.
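
One way to confirm which CUDA version the installed wheel was built with is a minimal sketch like the one below. The function name `torch_cuda_report` is hypothetical and is not taken from sample_main.py; the sketch only uses standard PyTorch attributes and returns early if PyTorch is not installed.

```python
import importlib.util


def torch_cuda_report() -> dict:
    """Collect version info about the CUDA build of the installed PyTorch wheel.

    Hypothetical helper for illustration; not part of this repository.
    """
    # Avoid an ImportError on machines where PyTorch is not installed.
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the wheel was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        # True only if a usable GPU and driver are visible to PyTorch.
        "cuda_available": torch.cuda.is_available(),
        # cuDNN version as an integer, or None if cuDNN is unavailable.
        "cudnn_version": torch.backends.cudnn.version(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` reports a version while the base image ships no CUDA toolkit, that is consistent with the wheel using its own CUDA libraries.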

In addition, the UID is switched back to 1000 across all containers in this repo. Using 1010 was causing permission issues on Linux, as covered on this page:

https://code.visualstudio.com/remote/advancedcontainers/add-nonroot-user

Because of this, your container user will either need to have the same UID or be in a group with the same GID. The actual name of the user / group does not matter. The first user on a machine typically gets a UID of 1000, so most containers use this as the ID of the user to try to avoid this problem.
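
The UID/GID rule quoted above can be checked from inside a container with a small sketch like this one. The helper name `uid_report` and the constant `EXPECTED_UID` are assumptions made for illustration; the sketch uses only the Python standard library and is Unix-only because it calls `os.getuid`.

```python
import os
import tempfile

# Assumption for illustration: the UID this PR switches the containers back to.
EXPECTED_UID = 1000


def uid_report(path: str) -> dict:
    """Compare the current process's UID/GIDs with the owner of `path`."""
    st = os.stat(path)
    process_groups = set(os.getgroups()) | {os.getgid()}
    return {
        "process_uid": os.getuid(),
        "path_owner_uid": st.st_uid,
        "path_owner_gid": st.st_gid,
        # Same UID as the owner: the case the quoted guidance recommends.
        "uid_matches_owner": os.getuid() == st.st_uid,
        # Alternative case: the user belongs to a group with the owner's GID.
        "in_owner_group": st.st_gid in process_groups,
        "process_uid_is_expected": os.getuid() == EXPECTED_UID,
    }


if __name__ == "__main__":
    # A directory created by this process is owned by this process's UID.
    with tempfile.TemporaryDirectory() as tmp:
        for key, value in uid_report(tmp).items():
            print(f"{key}: {value}")
```

Running it against a bind-mounted workspace path instead of a temporary directory would show whether the container user and the host file owner line up.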

Tests

The new GPU container was tested on a Linux GPU machine.

Both nvidia-smi and python sample_main.py worked.
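
A check like the nvidia-smi step could be scripted roughly as follows. This is a sketch under assumptions, not the test that was actually run: `list_gpus` is a hypothetical helper, and it relies on `nvidia-smi -L`, which prints one line per detected GPU.

```python
import shutil
import subprocess


def list_gpus() -> list[str]:
    """Return one line per GPU reported by `nvidia-smi -L`, or [] if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # nvidia-smi is not on PATH, e.g. the container was started without GPU access.
        return []
    result = subprocess.run([exe, "-L"], capture_output=True, text=True, check=False)
    if result.returncode != 0:
        # The tool exists but could not talk to the driver.
        return []
    return [line for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = list_gpus()
    print(f"{len(gpus)} GPU(s) visible")
    for gpu in gpus:
        print(gpu)
```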


Does this introduce a breaking change?

  • Yes
  • No

Author pre-publish checklist

  • No PII in logs or output
  • Made corresponding changes to the documentation
  • All new packages used are included in requirements.txt
  • Functions use type hints, and there are no type hint errors

Pull Request Type

What kind of change does this Pull Request introduce?

  • Bugfix
  • Feature
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Documentation content changes
  • Experiment notebook

@fujikosu requested a review from Copilot on March 27, 2025 14:29
Contributor

Copilot AI left a comment

Pull Request Overview

This PR updates the GPU container by switching its base from a CUDA container to a PyTorch container to better manage CUDA dependencies and adjusts the UID to 1000 to prevent permission issues.

  • Updated sample_main.py to print additional CUDA and cuDNN version info for debugging purposes.
  • Changed the target Python version from 3.9 to 3.11 in pyproject.toml for improved compatibility.

Reviewed Changes

Copilot reviewed 2 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/sample_pytorch_gpu_project/sample_main.py | Added print statements to output CUDA and cuDNN version details. |
| pyproject.toml | Updated target-version from py39 to py311. |

Files not reviewed (3)
  • notebooks/.devcontainer/Dockerfile: Language not supported
  • src/sample_cpu_project/.devcontainer/Dockerfile: Language not supported
  • src/sample_pytorch_gpu_project/.devcontainer/Dockerfile: Language not supported

@fujikosu requested a review from bhavikm on March 27, 2025 14:30
Comment thread src/sample_pytorch_gpu_project/.devcontainer/Dockerfile
@fujikosu requested a review from bhavikm on March 31, 2025 07:43
@fujikosu merged commit 8da7506 into main on Mar 31, 2025
3 checks passed
@fujikosu deleted the feat/simple-gpu-container branch on March 31, 2025 09:41
3 participants