Closed
Description
Hello,
I am kinda in a rabbit hole:
- `DCGM_FI_DEV_GPU_UTIL` is not supported for MIG devices: DCGM_FI_DEV_GPU_UTIL with MIG devices DCGM#80 (comment)
- `DCGM_FI_PROF_SM_OCCUPANCY` could be a substitute, but it is disabled by default in the metrics file: `kubectl exec -it nvidia-dcgm-exporter-rh46x -- cat /etc/dcgm-exporter/dcp-metrics-included.csv | less`
- To enable `DCGM_FI_PROF_*` I found this issue, but the piece of documentation it refers to is gone: GPU-operator doesn't allow to specify a volume to mount metrics file for nvidia-dcgm-exporter #275 (comment)
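
For reference, here is a minimal sketch of how the enabled and commented-out fields in that CSV could be listed. It assumes the file uses one `FIELD, type, help text` entry per line and that disabled entries are commented out with a leading `#`. The function name `split_metrics` and the sample content are made up for illustration and are not copied from the real file.

```python
def split_metrics(csv_text):
    """Return (enabled, disabled) lists of DCGM field names.

    Assumption: a disabled entry is a normal entry prefixed with '#'.
    """
    enabled, disabled = [], []
    for raw in csv_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        commented = line.startswith("#")
        # Drop the comment marker, then take the first comma-separated column.
        name = line.lstrip("#").strip().split(",")[0].strip()
        if not name.startswith("DCGM_"):
            continue  # header line or free-text comment, not a field entry
        (disabled if commented else enabled).append(name)
    return enabled, disabled


# Hypothetical sample, only for illustration.
SAMPLE = """\
# Format: DCGM field, Prometheus metric type, help message
DCGM_FI_DEV_FB_USED, gauge, Framebuffer memory used (in MiB).
# DCGM_FI_PROF_SM_OCCUPANCY, gauge, Ratio of warps resident on an SM.
"""

enabled, disabled = split_metrics(SAMPLE)
print("enabled:", enabled)
print("disabled:", disabled)
```

The text returned by the `kubectl exec ... cat` command above could be passed in as `csv_text` instead of the sample.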
Has anybody managed to monitor memory utilization of MIG devices?
Has anybody managed to configure custom metrics for dcgm-exporter?
Thank you.