trixirt/rocm-distro-containers
How to use containers to do some builds

---
Cheat sheet

Build:
docker build -t <os> .

Run:
docker run --volume <host-path>:<container-path> -it --rm -p 8080:8080 --cpus=<num> <os>

--volume is optional
-p is optional
--cpus is optional but useful.

---
Setup the host

Some links for different distros

Fedora
https://developer.fedoraproject.org/tools/docker/docker-installation.html

OpenSUSE
https://www.suse.com/c/rancher_blog/introduction-to-using-docker/

Ubuntu
https://docs.docker.com/engine/install/ubuntu/
# apt-get install docker.io

---
Troubleshooting

----
Using /dev/kfd

[root@3ff4ca99a474 test]# rocminfo
ROCk module is loaded
Unable to open /dev/kfd read-write: No such file or directory
Failed to get user name to check for video group membership

You likely need to pass --device /dev/kfd --device /dev/dri to the docker run command.

----
Permission for using chroots

debian/ubuntu :
root@c4384c2d0119:/# schroot -c <your chroot>
E: 10mount: mount: /run/schroot/mount/<your chroot ...>: permission denied.
E: 10mount: dmesg(1) may have more information after failed mount system call.
E: <your chroot ...>: Chroot setup failed: stage=setup-start

You likely need to pass --privileged to the docker run command.

---
Starting distro images

SUSE
https://en.opensuse.org/Docker
https://hub.docker.com/r/opensuse/tumbleweed

cs10
docker pull quay.io/centos/centos:stream10

---
Examples of uses

1. Test rocfft on OpenSUSE

Go to the OpenSUSE check dir for librocfft0
> cd opensuse/tumbleweed/librocfft0/check

Build the container as root
# docker build -t test .

On a machine with an AMD GPU, run the container
# docker run --device /dev/kfd --device /dev/dri -it --rm test

2. Creating a new test

Use the source!
Here is how the librocfft0 check Dockerfile compares to the build Dockerfile.

> cd opensuse/tumbleweed/librocfft0/
$ diff -u Dockerfile check/Dockerfile
--- Dockerfile 2025-06-04 13:47:33.596666381 -0700
+++ check/Dockerfile 2025-06-04 14:28:38.120159800 -0700
@@ -11,6 +11,17 @@
 RUN zypper ar -G -f https://download.opensuse.org/repositories/science:/GPU:/ROCm/openSUSE_Tumbleweed/ rocm
 RUN zypper -n si librocfft0
 RUN zypper -n si -d librocfft0
-RUN rpmbuild -ba ~/rpmbuild/SPECS/*.spec
-CMD ["bash"]
\ No newline at end of file
+RUN zypper -n install \
+    rocrand-devel \
+    fftw-devel \
+    boost-devel \
+    hipcc-libomp-devel \
+    hiprand-devel \
+    gtest
+
+RUN rpmbuild --with test -ba ~/rpmbuild/SPECS/*.spec
+
+RUN rpm -ihv --nodeps ~/rpmbuild/RPMS/x86_64/*
+
+CMD ["rocfft-test"]
\ No newline at end of file

Same diff, now annotated

$ diff -u Dockerfile check/Dockerfile
--- Dockerfile 2025-06-04 13:47:33.596666381 -0700
+++ check/Dockerfile 2025-06-04 14:28:38.120159800 -0700
@@ -11,6 +11,17 @@
 RUN zypper ar -G -f https://download.opensuse.org/repositories/science:/GPU:/ROCm/openSUSE_Tumbleweed/ rocm
 RUN zypper -n si librocfft0
 RUN zypper -n si -d librocfft0
-RUN rpmbuild -ba ~/rpmbuild/SPECS/*.spec
+RUN rpmbuild --with test -ba ~/rpmbuild/SPECS/*.spec

The rpm is built with the option '--with test'. In the specfile, that option is handled by this section

%bcond_with test
%if %{with test}
# Disable rpatch checks for a local build
%global __brp_check_rpaths %{nil}
%global build_test ON
%else
%global build_test OFF
%endif

Look for other instances of %if %{with test} in the specfile.

\ No newline at end of file
+RUN zypper -n install \
+    rocrand-devel \
+    fftw-devel \
+    boost-devel \
+    hipcc-libomp-devel \
+    hiprand-devel \
+    gtest
+

Building for testing requires other dependencies to be installed. This is the part that will change depending on the package.
You can find the list by looking in the specfile, for example rocfft.spec

%if %{with test}
BuildRequires: rocrand-devel
BuildRequires: fftw-devel
BuildRequires: boost-devel
BuildRequires: hipcc-libomp-devel
BuildRequires: hiprand-devel
%if 0%{?suse_version}
BuildRequires: gtest
%else
BuildRequires: gtest-devel
%endif

Note, because this is a SUSE test, we install gtest rather than gtest-devel, which is what we would install for Fedora or RHEL.

+RUN rpm -ihv --nodeps ~/rpmbuild/RPMS/x86_64/*

This installs all the rpms that were just built, including the test rpms.

-CMD ["bash"]
+CMD ["rocfft-test"]

This changes the run command from bash to the just-installed rocfft-test.

Apply this to creating a test for OpenSUSE rocblas.

$ cd opensuse/tumbleweed/librocblas4
$ mkdir check
$ cp Dockerfile check/

Review the rocblas.spec file for test dependencies. From these lines in the spec file

%if %{with test}
BuildRequires: blas-devel
BuildRequires: libomp-devel
BuildRequires: python3dist(pyyaml)
BuildRequires: rocminfo
BuildRequires: rocm-smi-devel
BuildRequires: roctracer-devel
%if 0%{?suse_version}
BuildRequires: cblas-devel
BuildRequires: gcc-fortran
BuildRequires: gtest
%else
BuildRequires: gtest-devel
%endif
%endif

Add this line to install the test dependencies

RUN zypper -n install \
    blas-devel \
    libomp-devel \
    python311-PyYAML \
    python312-PyYAML \
    python313-PyYAML \
    rocminfo \
    rocm-smi-devel \
    roctracer-devel \
    cblas-devel \
    gcc-fortran \
    gtest

The tricky part is translating this
BuildRequires: python3dist(pyyaml)

To find the likely packages to install, do this

> zypper search pyyaml
S  | Name                     | Summary                                                               | Type
---+--------------------------+-----------------------------------------------------------------------+--------
i  | python311-PyYAML         | YAML parser and emitter for Python                                    | package
   | python311-pyyaml_env_tag | A custom YAML tag for referencing environment variables in YAML files | package
   | python311-types-PyYAML   | Typing stubs for PyYAML                                               | package
i  | python312-PyYAML         | YAML parser and emitter for Python                                    | package
   | python312-pyyaml_env_tag | A custom YAML tag for referencing environment variables in YAML files | package
   | python312-types-PyYAML   | Typing stubs for PyYAML                                               | package
i  | python313-PyYAML         | YAML parser and emitter for Python                                    | package
   | python313-pyyaml_env_tag | A custom YAML tag for referencing environment variables in YAML files | package
   | python313-types-PyYAML   | Typing stubs for PyYAML                                               | package

Now build the tests, change
RUN rpmbuild -ba ~/rpmbuild/SPECS/*.spec
to
RUN rpmbuild -ba --with test ~/rpmbuild/SPECS/*.spec

Then install the rpms that were just built, add this line
RUN rpm -ihv --nodeps ~/rpmbuild/RPMS/x86_64/*

Finally, change the CMD at the bottom to run the tests
CMD ["rocblas-test"]

3. Picking which GPU to test

-e HIP_VISIBLE_DEVICES=0

When the system has multiple GPUs, you may face a problem like

Query device success: there are 2 devices
-------------------------------------------------------------------------------
Device ID 0 : AMD Radeon Pro W7900 gfx1100
with 48.3 GB memory, max. SCLK 1760 MHz, max. MCLK 1124 MHz, memoryBusWidth 48 Bytes, compute capability 11.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 32
-------------------------------------------------------------------------------
Device ID 1 : AMD Radeon Graphics gfx1036
with 67.1 GB memory, max. SCLK 2200 MHz, max. MCLK 1800 MHz, memoryBusWidth 16 Bytes, compute capability 10.3
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 32
-------------------------------------------------------------------------------
info: parsing of test data may take a couple minutes before any test output appears...
Note: Google Test filter = -:*stress*
[==========] Running 1254647 tests from 210 test suites.
[----------] Global test environment set-up.
[----------] 1 test from _/multiheaded
rocBLAS error: Cannot read /lib64/rocblas/library/TensileLibrary.yaml: No such file or directory for GPU arch : gfx1036

The failure is reported for gfx1036, which is device 1.

Read the HIP docs, for example
https://rocm.docs.amd.com/projects/HIP/en/docs-6.0.0/how_to_guides/debugging.html

Making Device visible
For system with multiple devices, it’s possible to make only certain device(s) visible to HIP via setting environment variable, HIP_VISIBLE_DEVICES(or CUDA_VISIBLE_DEVICES on Nvidia platform), only devices whose index is present in the sequence are visible to HIP. For example,
HIP_VISIBLE_DEVICES=0,1

Pass -e HIP_VISIBLE_DEVICES=<num> to the docker run command, for example
# docker run --device /dev/kfd --device /dev/dri -e HIP_VISIBLE_DEVICES=0 -it --rm test

4. Rebuild a just-built container

Docker caches most of the steps, so to rebuild an image it is necessary to clean the cache and remove the image.

$ docker builder prune -a
$ docker rmi b

5. Where are the SUSE ROCm packages ?

The SUSE packages are produced by OBS. This is the project link
https://build.opensuse.org/project/show/science:GPU:ROCm

To find the repo location for the binaries, look at the 'Build Results' for the SUSE version closest to what you are building or testing.
For example, for 15.6, click 15.6 and you are taken to this page
https://build.opensuse.org/project/repository_state/science:GPU:ROCm/15.6

Look for the link 'Go to download repository'
https://download.opensuse.org/repositories/science:/GPU:/ROCm/15.6/

It is necessary to add this repo so the ROCm packages can be found by zypper.
An example of the use is
https://github.com/trixirt/rocm-distro-containers/blob/main/suse/15.6/amdsmi/check/Dockerfile

This line shows how to add the repo.
RUN zypper ar -G -f https://download.opensuse.org/repositories/science:/GPU:/ROCm/15.6/ rocm

For other versions of SUSE, replace the URL with the download repository for the version that is required.

6. Using copr/side-tag Fedora/EPEL repos

It is sometimes necessary to add third party or your own repos to the normal set. This is common when you are using either a COPR or a side-tag.
The general form to add a repo is
RUN dnf config-manager --add-repo <URL>

With an example of adding a side-tag repo
RUN dnf config-manager --add-repo https://kojipkgs.fedoraproject.org/repos/epel10.1-build-side-113908/latest/x86_64/

If, when later using the added repo, you get an error like

...
38.45 Package rocm-runtime-6.4.1-1.el10_1.x86_64.rpm is not signed
38.45 Package rocm-runtime-devel-6.4.1-1.el10_1.x86_64.rpm is not signed
38.45 The downloaded packages were saved in cache until the next successful transaction.
38.45 You can remove cached packages by executing 'dnf clean packages'.
38.47 Error: GPG check FAILED

add the option --nogpgcheck to the dnf command. An example is
RUN dnf --nogpgcheck -y builddep rocfft/rocfft.spec

7. Debian/Ubuntu build order

rocm-cmake
rocm-smi-lib
amdsmi
rocm-llvm
rocr-runtime
rocminfo
rocm-hipamd
pkg-rocm-tools
rocrand
hiprand
rocfft
hipfft
roctracer
rocblas
rocprim
rocthrust
hipcub
rocsparse
hipsparse
rocalution
hipblas-common
rocsolver
hipblas
hipsolver

8. Fedora/RHEL build order

python-tensile
rocm-cmake
rocm-compilersupport
rocm-smi
amdsmi
rocm-runtime
rocminfo
half
rocm-rpm-macros
rocclr
rocrand
rocfft
roctracer
rocblas
rocm-core
hipify
rccl
rocprim
rocthrust
hipcub
rocsparse
rocsolver
rocalution
hiprand
hipfft
hipblas-common
hipblas
hipsparse
rocdecode
rocjpeg
hipsolver
fplus
frugally-deep
miopen
rocm-omp
rocwmma
hipblaslt
hipsparselt
rocm-examples
hipfort
rocm-rpp
mivisionx
rocal
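
A build order like the ones listed can be walked with a small script. This is only a sketch and is not part of this repo: the function name build_in_order, the DRY_RUN variable and the fedora/rawhide path are made up for illustration, and it assumes each package has its own directory holding a Dockerfile, the way opensuse/tumbleweed/librocfft0 does. By default it only prints the docker build commands.

```shell
#!/bin/sh
# Sketch only: build one container per package, in the order given.
# ASSUMPTION: every package lives in <base>/<package>/ with its own
# Dockerfile. build_in_order and DRY_RUN are hypothetical names.

build_in_order() {
    base=$1
    shift
    for pkg in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Default: only print what would be run.
            echo "docker build -t $pkg $base/$pkg"
        else
            # Stop at the first failure so the order is respected.
            docker build -t "$pkg" "$base/$pkg" || {
                echo "build failed at $pkg" >&2
                return 1
            }
        fi
    done
}

# Usage (dry run), first few packages of the Fedora/RHEL order:
#   build_in_order fedora/rawhide python-tensile rocm-cmake rocm-compilersupport
```

Set DRY_RUN=0 to actually run the builds, as root if your docker setup requires it.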