[WIP] Support Intel GPU (DRM)#127
Conversation
- Add byte-unit dependency.
- Parse display units from template suffix, e.g., `{mem_used_MiB}`.
- Small optimizations.
Add `uom` dependency and support degree C, degree F, and Kelvin for temperature.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
- Replace `byte_unit` crate with `uom` for memory and PCIe units. - Change field-unit separator from `_` to `:` in templates. Note: This is a large merge refactors.
Added KB, MB, GB, Kib, Mib, Gib, Kb, Mb, Gb.
Remove the `to_string()`s and directly write displays to buffer.
Also renamed `DisplayValue` to `SimpleDisplay`.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
- Support configurable decimal places for memory, temperature, and power fields.
- Users can specify decimal places using a format like `{temperature:f.2}`.
- The intended precision is specified after the *dot*.
This reverts commit 352074f.
trim_trailing_zeros now takes a protected_len to avoid trimming earlier segments.
A test is added to cover that.
If precision of a field is not provided, display all available digits.
Sort the drm devices based on their child named `cardX`.
- Remove Instance struct. Now we directly call get_handle(). - Nvml and AmdGpuHandle contexts are now created and held inside NvidiaGpuStatus and AmdGpuStatus. - Print GPU device name in main().
📝 WalkthroughWalkthroughThis PR refactors GPU status into a trait-based design, adds DRM-based device discovery and Intel iGPU support, introduces unit-aware formatting with precision, updates NVIDIA and AMD backends to per-field getters, and adjusts CI/build deps for udev-related tooling. Changes
Sequence DiagramsequenceDiagram
participant Main as main.rs
participant DRM as DRM Scanner
participant Hwdb as udev Hwdb
participant GPU as GPU Backend<br/>(GpuStatus impl)
participant Handle as GpuHandle
participant Formatter as Formatter
Main->>DRM: scan_drm_devices()
DRM->>Hwdb: query model/vendor (Hwdb)
DRM-->>Main: Vec<DrmDevice>
Main->>Main: select GPU by index
Main->>GPU: instantiate backend (Nvidia/Amd/Intel)
Main-->>Handle: wrap backend in GpuHandle
loop per interval
Main->>Handle: update(procs)
Handle->>GPU: update(procs)
GPU-->>Handle: cache field values
Main->>Formatter: State::try_from_format(format_str)
Formatter->>Formatter: parse FormatSegments (field, unit, precision)
Main->>Handle: get_text/get_tooltip(state)
Handle->>GPU: get_u8_field/get_mem_field/get_temperature/get_power
GPU-->>Handle: field values or Unavailable
Handle->>Formatter: assemble using field values
Formatter-->>Main: rendered output
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
This comment was marked as outdated.
This comment was marked as outdated.
- Add libc dependency. - Move code in old drm.rs to drm/device.rs. - Some infrastructure for drm clients.
- Introduce IntelGpuStatus and integrate into main flow. - Refactor ClientManager, EngineStats, EngineSample; improve FD scanning and manage device names. - Move NVIDIA process check.
- Take max of render and video engine utilization as GPU utilization. - Add U8Field::RenderUtilization and U8Field::VideoUtilization. - Remove GetFieldError::NotReady because it's never used.
There was a problem hiding this comment.
Actionable comments posted: 19
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
README.md (1)
9-9:⚠️ Potential issue | 🟡 MinorIntel GPU support is not mentioned in the compatibility description.
This PR adds Intel iGPU monitoring, but line 9 still reads "compatible with both NVIDIA and AMD cards." This should be updated to include Intel.
Proposed fix
-It is compatible with both NVIDIA and AMD cards +It is compatible with NVIDIA, AMD, and Intel cards
🤖 Fix all issues with AI agents
In `@README.md`:
- Around line 88-103: Add an "Intel" support column to the Available fields
table and update each row (for fields like `gpu_utilization`, `mem_used`,
`mem_total`, `mem_rw`, `mem_utilization`, `decoder_utilization`,
`encoder_utilization`, `temperature`, `power`, `p_state`, `p_level`,
`fan_speed`, `tx`, `rx`) to reflect Intel support; at minimum mark
`gpu_utilization` as supported (✅) and set unknown/unsupported fields to ❌ (or
leave blank) until their support is confirmed.
In `@src/amd.rs`:
- Around line 25-31: Replace direct index access to hw_monitors (currently using
&self.handle.hw_monitors[0]) with a safe lookup (e.g.
self.handle.hw_monitors.first().ok_or(/* appropriate amdgpu_sysfs::error::Error
*/)?) in fan_percentage, get_temperature, and get_power; if no monitor exists
return an appropriate amdgpu_sysfs::error::Error instead of panicking so each
function (fan_percentage, get_temperature, get_power) gracefully handles empty
hw_monitors.
In `@src/drm/client.rs`:
- Around line 112-119: The should_manage method can panic because it calls
target.file_name().unwrap(); update should_manage (matching FDTarget::Path and
using self.devnames) to handle the Option from Path::file_name() safely instead
of unwrapping. Replace the unwrap usage with a safe check (e.g., map_or or if
let Some(fname) = target.file_name()) that returns false when file_name() is
None, then compare fname to each entry in self.devnames; keep the FDTarget::Path
and devnames symbols to locate the change.
- Around line 22-47: In update_engines, avoid the unwraps when parsing fdinfo
lines for RENDER_ENGINE_KEY and VIDEO_ENGINE_KEY: replace the current
line.split_whitespace().nth(1).unwrap().parse().unwrap() with a safe parse flow
that checks nth(1) and parse::<u64>().ok(), skip the line (continue) on None or
parse failure, and only create EngineSample::new(value) and call
self.render_engine.update_utilization(...) or
self.video_engine.update_utilization(...) when parsing succeeds; ensure
is_render_updated/is_video_updated are set only after a successful parse so
malformed lines are ignored rather than causing a panic.
- Around line 159-165: The read_id function currently uses unwrap() twice and
will panic for malformed "drm-client-id" lines; change the chain to safely
handle missing tokens and parse errors by replacing the unwraps with fallible
combinators (e.g., use .split_whitespace().nth(1).and_then(|s|
s.parse::<u32>().ok()) or .filter_map(...) so a missing second token or parse
failure returns None instead of panicking), keeping the function signature and
behavior of returning Option<u32> intact and only returning Some(id) when a
valid u32 is present.
- Around line 128-142: In EngineStats::update_utilization, avoid unsigned
underflow and divide-by-zero by computing delta_used with saturating_sub (or
checked_sub and treating negative/None as 0) on sample.value and
last_sample.value, and skip or set utilization to None if delta_sample (computed
from sample.sample_finished_at.duration_since(...).as_nanos()) is 0; update
self.utilization only when delta_sample > 0 and delta_used > 0, then cast to f64
for the division. Use the symbols update_utilization, sample.value,
last_sample.value, delta_used, delta_sample, and utilization when applying the
fix.
- Around line 66-78: The call to client.update_engines().unwrap() in
drm::client::Client::update can panic if a process exits between
scan_process_fds and update_engines; replace the unwrap with proper error
handling in update (e.g., match or if let Err(e)) so IO errors from
update_engines() are logged and that client is skipped/left unchanged rather
than crashing the process. Locate the update method and the update_engines
method on the client struct, catch and log errors returned by update_engines(),
and ensure the loop continues even when update_engines() fails (do not propagate
the panic).
In `@src/drm/device.rs`:
- Around line 44-46: The error messages for hwdb query failures contain a typo
"exits" instead of "exists"; update the strings used in the two query_one calls
that produce those errors (the model_name assignment that calls
query_one(modalias, "ID_MODEL_FROM_DATABASE") and the vendor_name assignment
that calls query_one(modalias, "ID_VENDOR_FROM_DATABASE")) to replace "No
model/vendor name result exits in database" with "No model/vendor name result
exists in database".
- Around line 29-36: get_dri_card_index currently uses to_str().unwrap() and
parse().unwrap() which can panic on non-UTF-8 sysnames or numbers > u8::MAX;
change the implementation to be fully fallible by replacing unwraps with safe
conversions (use dev.sysname().to_str().ok() and then strip_prefix("card") and
parse::<u8>().ok()), e.g., iterate self.children and use find_map or filter_map
chaining to return None on invalid UTF-8 or parse failures so the function
returns an Option<u8> without panicking.
- Around line 93-107: PciId::from_device currently uses unwraps that will panic
on malformed PCI_ID; change to safely propagate None by replacing
to_str().unwrap() with to_str().ok()? and split_once(':').unwrap() with
split_once(':')? and handle the numeric parses by using u16::from_str_radix(...,
16).ok()? so vendor_str/device_str are obtained via split_once('?')? (i.e., use
? for Option-producing operations) and vendor_id/device_id use .ok()? before
constructing Some(Self { vendor_id, device_id }). This keeps the function
returning None for any invalid PCI_ID instead of panicking.
- Around line 17-26: DrmDevice::new currently calls
PciId::from_device(...).unwrap(), which will panic for non-PCI DRM devices;
change the constructor to propagate or handle the absence of a PCI id by either
(A) returning Result<DrmDevice, E> from DrmDevice::new and returning an
appropriate error when PciId::from_device returns None, or (B) change the struct
field pci_id: PciId to pci_id: Option<PciId> and store PciId::from_device(...)
directly (no unwrap), then update callers (notably scan_drm_devices) to filter
out or handle devices with None pci_id; update any code that assumed pci_id to
handle the Result or Option accordingly.
In `@src/gpu_status/fields.rs`:
- Line 98: Fix the typo in the doc comment that currently reads "Render engine
utilization in perccent." — update the comment text to "Render engine
utilization in percent." to correct the spelling for the field whose doc comment
contains that exact phrase.
In `@src/intel.rs`:
- Around line 45-57: The get_u8_field function currently multiplies a decimal
utilization (from compute_render_utilization and compute_video_utilization) by
100 and casts to u8, which can yield >100 (or saturate to 255); clamp the
decimal to the 0.0–1.0 range before scaling so the returned percentage is always
0–100: compute the decimal as you already do (using compute_render_utilization /
compute_video_utilization and the U8Field match), then replace the direct
scaling/cast with a clamped value (e.g., let pct = (decimal.clamp(0.0, 1.0) *
100.0).round() and cast pct to u8), ensuring the function get_u8_field always
returns a 0–100 u8 for GpuUtilization/RenderUtilization/VideoUtilization.
In `@src/main.rs`:
- Around line 24-53: get_handle currently constructs NvidiaGpuStatus via
NvidiaGpuStatus::new() which internally always selects NVML device index 0;
change the constructor usage and implementation so the DRM device's PCI bus ID
is passed through and used to look up the NVML device (use
nvml.device_by_pci_bus_id()) instead of device_by_index(0). Specifically, modify
get_handle to extract the DRM device PCI bus id from the DrmDevice (for the
NVIDIA branch) and call a new NvidiaGpuStatus::with_pci(pci_bus_id: String) (or
add a parameter to NvidiaGpuStatus::new) and update src/nvidia.rs to resolve the
NVML device via nvml.device_by_pci_bus_id(pci_bus_id) so the selected --gpu DRM
index maps to the correct NVML device.
- Around line 25-29: The code currently calls
gpu.get_vendor_name(...).into_string().unwrap() which will panic if the OsString
is non-UTF-8; change this to convert safely (e.g., use to_string_lossy()) and
then lowercase to produce a String. Locate the vendor_name assignment (variable
vendor_name and the call gpu.get_vendor_name) and replace the
into_string().unwrap() usage with a lossless conversion such as
.to_string_lossy().into_owned() (or otherwise handle the Err from into_string())
before calling .to_lowercase() so non-UTF-8 vendor names do not panic.
- Around line 128-143: print_gpu currently unwraps nodes and OsStr conversions
and can panic if gpu.children is empty or sysnames are non-UTF-8; modify
print_gpu to collect child names safely (e.g., map each
dev.sysname().to_string_lossy().into_owned()), join them with commas, and handle
the empty case by printing something like "<none>" instead of calling
nodes.next().unwrap(); keep using gpu.get_model_name(hwdb)? for model lookup but
replace all .to_str().unwrap() with to_string_lossy() conversions so no panics
occur when sysnames are non-UTF-8.
- Around line 112-118: The loop currently early-continues when
procfs::process::all_processes() returns Err, causing a tight spin; modify the
loop around procfs::process::all_processes() (the Ok(procs) pattern) so that on
error you call std::thread::sleep for a short duration (e.g., a few hundred
milliseconds) before continuing, rather than immediately continuing; ensure you
keep the existing error handling for gpu_status_handle.data.update and only add
the sleep in the error branch that handles procfs failures to avoid burning CPU.
In `@src/nvidia.rs`:
- Around line 215-235: The function has_running_processes currently hardcodes
"/dev/nvidia0"; change its signature to accept the target device (e.g.,
device_path: &str or device_index: usize) and use that value when comparing
FDTarget::Path in the fd loop, so that the comparison uses the actual GPU device
(format "/dev/nvidia{index}" or a path derived from the GPU's bus ID supplied by
NvidiaGpuStatus::new()). Update callers (including NvidiaGpuStatus::new) to pass
the correct device path/index derived from the selected GPU instead of relying
on index 0, and keep the existing proc.fd() and FDTarget::Path checks intact.
- Around line 43-45: The device() helper currently unwraps NVML lookup and will
panic if the GPU is unavailable; change fn device(&self) -> Device<'_> to return
a Result<Device<'_>, GetFieldError> (or appropriate error type) and convert the
nvml.device_by_index(0).unwrap() into a fallible call that maps NVML errors to
GetFieldError::Unavailable; then update all callers (the field getters) to use
self.device()? so the error is propagated instead of panicking; ensure signature
changes and error mapping are applied to functions that call device() so they
return Result and use the ? operator.
🧹 Nitpick comments (9)
src/intel.rs (1)
45-54: Both utilizations are recomputed on everyget_u8_fieldcall, even if only one is needed.Lines 46-47 always compute both
render_utilizationandvideo_utilizationregardless of whichU8Fieldvariant is requested. This iterates the clients list twice per call. ForRenderUtilizationorVideoUtilization, one of the two sums is wasted.Proposed refactor: compute lazily
fn get_u8_field(&self, field: U8Field) -> Result<u8, GetFieldError> { let decimal = match field { - U8Field::GpuUtilization => render_utilization.max(video_utilization), - U8Field::RenderUtilization => render_utilization, - U8Field::VideoUtilization => video_utilization, + U8Field::GpuUtilization => { + self.compute_render_utilization().max(self.compute_video_utilization()) + } + U8Field::RenderUtilization => self.compute_render_utilization(), + U8Field::VideoUtilization => self.compute_video_utilization(), _ => return Err(GetFieldError::BrandUnsupported), };src/gpu_status/fields.rs (1)
112-123:MemFieldconflates memory quantities with PCIe throughput.
TxandRxare throughput rates, not memory amounts. Grouping them underMemFieldworks because they share theInformationUOM type, but the name is misleading. Consider renaming toInformationFieldor similar if this grows. Fine for now in a WIP PR.src/config/structs.rs (1)
93-97: Parse errors in format segments are silently treated as "unavailable", removing the line.On Line 96, if
Field::try_fromfails (e.g., user wrote{temperature:xyz}— an invalid unit),map_or(true, ...)silently drops that line instead of surfacing the parse error. This could mask typos in user-defined tooltip formats.Consider logging a warning here for
Errcases, similar to howparse()insrc/formatter/mod.rsprints a warning for unknown fields.Proposed change
let has_unavailable = re.captures_iter(line).any(|caps| { let format_segments = FormatSegments::from_caps_unchecked(&caps); - Field::try_from(format_segments).map_or(true, |f| handle.is_field_unavailable(f)) + match Field::try_from(format_segments) { + Ok(f) => handle.is_field_unavailable(f), + Err(e) => { + eprintln!("Warning: failed to parse field `{}`: {e}", format_segments.field); + true + } + } });src/amd.rs (1)
62-69: Unnecessaryeyre!allocation — error is immediately discarded.The
eyre!(format!(...))on Lines 65-68 creates an error that is immediately mapped away by.map_err(|_| GetFieldError::Unavailable). Simplify took_or(GetFieldError::Unavailable)directly.Proposed fix
let temp = temps .iter() .find(|t| t.0 == TEMP_SENSOR_NAME) - .ok_or(eyre!(format!( - "No \"{}\" temperature sensor found", - TEMP_SENSOR_NAME - ))) - .map_err(|_| GetFieldError::Unavailable)?; + .ok_or(GetFieldError::Unavailable)?; let temp = temp.1.current.ok_or(GetFieldError::Unavailable)?;src/gpu_status/mod.rs (3)
54-56: Consider restrictingdatafield visibility topub(crate).
GpuHandle.datais public, exposing the internalBox<dyn GpuStatus>to external consumers. Sincemain.rsaccesseshandle.data.update(procs),pub(crate)would be sufficient to limit exposure.Proposed change
pub struct GpuHandle { - pub data: Box<dyn GpuStatus>, + pub(crate) data: Box<dyn GpuStatus>, }
137-159:is_field_unavailablecalls the same getters thatwrite_fieldwill call later — double work.During tooltip rendering,
retain_lines_with_valuescallsis_field_unavailablefor each field, thenassemblecallswrite_fieldwhich calls the same getters again. For fields backed by NVML or sysfs calls, this doubles the I/O. Since this only happens once at startup (not in the hot loop), it's acceptable, but worth noting if this is ever used in the render loop.
161-171:compute_mem_usageis correct but lacks a defensive clamp.Line 170:
(ratio * 100.0).round() as u8— ifmem_usedever exceedsmem_total(e.g., shared/swap memory accounting), the result would overflowu8. A.clamp(0.0, 100.0)before the cast would be defensive.Proposed fix
let ratio: f32 = (mem_used / mem_total).into(); - Some((ratio * 100.0).round() as u8) + Some((ratio * 100.0).round().clamp(0.0, 100.0) as u8)src/formatter/mod.rs (1)
71-73:get_regex()recompiles the regex on every call.While it's only called a few times (not in the hot loop), regex compilation isn't free. Consider using
std::sync::LazyLock(stable since Rust 1.80) oronce_cellto compile once.Example using LazyLock
+use std::sync::LazyLock; + +static FORMAT_RE: LazyLock<Regex> = LazyLock::new(|| { + Regex::new(r"\{(\w+)(?::(\w+)(?:\.(\d+))?)?\}").unwrap() +}); + pub fn get_regex() -> Regex { - Regex::new(r"\{(\w+)(?::(\w+)(?:\.(\d+))?)?\}").unwrap() + FORMAT_RE.clone() }Or return
&'static Regexand update callers to use a reference.src/nvidia.rs (1)
48-78: Multipleself.device()calls per field query adds NVML overhead.Each field getter calls
self.device()which performsnvml.device_by_index(0). While NVML handle lookups are fast, this is called many times per update cycle (once per field in the format string). Consider caching theDeviceinupdate()if profiling shows this matters.
DrmDevice::new() now returns NotPciDeviceError if PciId::from_device returns None.
- NvidiaGpuStatus::new() now takes a bus_id instead of getting it from NVML index 0 device. - nvidia::has_running_processes now checks devnames, usually being `card*` and `renderD*`, instead of `nvidia0`. - main::print_gpu() now prints bus ID.
Use Nvml::device_by_pci_bus_id() instead of Nvml::device_by_index().
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/nvidia.rs`:
- Around line 232-238: The loop uses path.file_name().unwrap() which can panic
for paths like ".."; replace the unwrap with a safe Option check similar to
client.rs: test the file_name() via is_some_and (or map/and_then) and only
compare when present. Update the FDTarget::Path match in this loop to use
path.file_name().is_some_and(|fname| devnames.iter().any(|n| n == fname))
(referencing FDTarget::Path, path.file_name(), and devnames) so the function
returns true only when a file_name exists and matches.
🧹 Nitpick comments (1)
src/intel.rs (1)
45-57: Both utilization values are computed even when only one is needed.
compute_render_utilization()andcompute_video_utilization()are both called unconditionally on Lines 46-47, even forRenderUtilizationorVideoUtilizationwhich only use one. Consider computing lazily inside the match arms.Proposed refactor
fn get_u8_field(&self, field: U8Field) -> Result<u8, GetFieldError> { - let render_utilization = self.compute_render_utilization(); - let video_utilization = self.compute_video_utilization(); - let decimal = match field { - U8Field::GpuUtilization => render_utilization.max(video_utilization), - U8Field::RenderUtilization => render_utilization, - U8Field::VideoUtilization => video_utilization, + U8Field::GpuUtilization => { + self.compute_render_utilization().max(self.compute_video_utilization()) + } + U8Field::RenderUtilization => self.compute_render_utilization(), + U8Field::VideoUtilization => self.compute_video_utilization(), _ => return Err(GetFieldError::BrandUnsupported), };
Overview
Support monitoring Intel GPU render and video engine utilization, by processing DRM client file descriptors.
Utilization Computation
The engine utilization is calculated using the formula
The overall GPU utilization is defined as the maximum value between these two engines. (#101 (comment))
Refactoring
With the extension of supported GPU brands from two (NVIDIA & AMD) to three, I have introduced a
GpuStatustrait. This replaces the previous universal struct, which was becoming sparse withNonevalues—particularly since we can only get utilization info for Intel GPU.Credits
The computation logic is heavily based on qmassa.
Closes #101.
Tracker
--gpu <index>. If not specified, it defaults to index 0. The GPU index is derived from/dev/dri/cardN. (Closes [Feature Request] Add Function to Display Single or Multiple GPU Data #106)Summary by CodeRabbit
New Features
Configuration