fallback for libdcgm.so for backward compatibility #81
valner wants to merge 1 commit into NVIDIA:main
datacenter-gpu-manager v4 provides the file libdcgm.so.4, but older versions provide libdcgm.so. To support backward compatibility, we should fall back to libdcgm.so if libdcgm.so.4 isn't available.
@nvvfedorov could you please take a look?
The main branch of the Go-DCGM bindings is only intended to support the current version of DCGM (currently 4.x). Support for older versions of DCGM is not built into the bindings, and older versions are not tested.
This makes the changes proposed in NVIDIA/go-dcgm#81
Signed-off-by: Andre Fredette <afredette@redhat.com>
anfredette left a comment
I support merging this PR.
I tried using DCGM from a Go program via go-dcgm on Fedora 40 and then on Ubuntu 24.04.2 LTS. I followed the directions here: https://github.com/NVIDIA/DCGM?tab=readme-ov-file#quickstart to install DCGM. In both cases, v3.3.9 of libdcgm.so was installed, so it didn’t work.
So I used a fork of go-dcgm with this PR applied, and it worked.
F40 is a little old right now, so possibly F42 has v4 of libdcgm, but Ubuntu 24.04.2 is the latest LTS, and it’s what you get by default on AWS for Ubuntu. So I think it’s worth making this work with v3 at least.
datacenter-gpu-manager v4 provides the file libdcgm.so.4, but older versions provide libdcgm.so.
To support backward compatibility with datacenter-gpu-manager 3, we should fall back to libdcgm.so if libdcgm.so.4 isn't available.