Description
When results, err := dcgm.RunDiag(dcgm.DiagQuick, dcgm.GroupAllGPUs()) is called and the results are printed, the output is:
{Software:[{Status:pass TestName:presence of drivers on the denylist (e.g. nouveau) TestOutput:Allocated 83618558100 bytes (98.4%) ErrorCode:0 ErrorMessage:} {Status:pass TestName:presence of drivers on the denylist (e.g. nouveau) TestOutput:Allocated 83618558100 bytes (98.4%) ErrorCode:0 ErrorMessage:} {Status:pass TestName:presence of drivers on the denylist (e.g. nouveau) TestOutput:Allocated 83618558100 bytes (98.4%) ErrorCode:0 ErrorMessage:}]}
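Shouldn't the TestName values be "software", "memory", and "pcie", as the dcgmi command displays them? Instead, all three entries carry the same TestName and the same TestOutput. I also see an unused function, gpuTestName, in diag.go, which looks like it should be supplying the test name.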
dcgmi diag -r 2
Successfully ran diagnostic for group.
+---------------------------+------------------------------------------------+
| Diagnostic                | Result                                         |
+===========================+================================================+
|-----  Metadata  ----------+------------------------------------------------|
| DCGM Version              | 4.4.1                                          |
| Driver Version Detected   | 580.76.05                                      |
| GPU Device IDs Detected   | 2330                                           |
|-----  Deployment  --------+------------------------------------------------|
| software                  | Pass                                           |
|                           | GPU0: Pass                                     |
+-----  Hardware  ----------+------------------------------------------------+
| memory                    | Pass                                           |
|                           | GPU0: Pass                                     |
+-----  Integration  -------+------------------------------------------------+
| pcie                      | Pass                                           |
|                           | GPU0: Pass                                     |
+---------------------------+------------------------------------------------+
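The symptom can also be checked programmatically. Here is a minimal sketch, assuming a local stand-in struct (diagResult) that only mirrors the fields visible in the printed output; it is not the go-dcgm type, and duplicateTestNames is a hypothetical helper, not part of the library. It flags any TestName that appears more than once, which is the case for the observed results but not for the names dcgmi shows.

```go
package main

import (
	"fmt"
	"sort"
)

// diagResult is a local stand-in that mirrors the fields seen in the printed
// output (Status, TestName, TestOutput, ErrorCode, ErrorMessage).
// Assumption: it is NOT the go-dcgm type itself.
type diagResult struct {
	Status       string
	TestName     string
	TestOutput   string
	ErrorCode    uint
	ErrorMessage string
}

// duplicateTestNames (hypothetical helper) returns, in sorted order, every
// TestName that occurs more than once in results.
func duplicateTestNames(results []diagResult) []string {
	counts := make(map[string]int)
	for _, r := range results {
		counts[r.TestName]++
	}
	var dups []string
	for name, n := range counts {
		if n > 1 {
			dups = append(dups, name)
		}
	}
	sort.Strings(dups)
	return dups
}

func main() {
	const denylist = "presence of drivers on the denylist (e.g. nouveau)"
	const output = "Allocated 83618558100 bytes (98.4%)"

	// Values copied from the RunDiag output reported in this issue.
	observed := []diagResult{
		{Status: "pass", TestName: denylist, TestOutput: output},
		{Status: "pass", TestName: denylist, TestOutput: output},
		{Status: "pass", TestName: denylist, TestOutput: output},
	}

	// Names as shown by dcgmi diag -r 2.
	expected := []diagResult{
		{Status: "pass", TestName: "software"},
		{Status: "pass", TestName: "memory"},
		{Status: "pass", TestName: "pcie"},
	}

	fmt.Println(len(duplicateTestNames(observed))) // prints 1
	fmt.Println(len(duplicateTestNames(expected))) // prints 0
}
```

A check like this could be turned into a regression test once the names are fixed, by feeding it the slice returned in the Software field.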