Skip to content

Conversation

@EdSwarthout
Copy link
Contributor

Allow any module suffix.

Commit 1f5ce73 ("Fix trigger of CDI refresh service") did not allow for compressed modules with suffixes such as .zst or .gz.

Also fix systemd warning:

/etc/systemd/system/nvidia-cdi-refresh.service:26: Ignoring unknown escape sequences: "/(nvidia|nvidia-current).ko[:]"

@copy-pr-bot
Copy link

copy-pr-bot bot commented Dec 5, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@Lioli7k
Copy link

Lioli7k commented Dec 6, 2025

Pattern in the patch looks sketchy, since it would gladly accept something like updates/nvidia.ko.: or updates/nvidia.ko.not!an,extension:

Here are some alternative pattern suggestions:

'/(nvidia|nvidia-current)\\.ko(\\.(zst|xz|gz|bz2))?[:]' # Be really specific
'/(nvidia|nvidia-current)\\.ko(\\.[[:alnum:]]+)?[:]'    # Allow any alphanumeric extension
'/(nvidia|nvidia-current)\\.ko'                         # Ignore everything after module name

UPD: Double backslashes in the pattern are needed due systemd trying to treat single backslash as an escape sequence.

@elezar elezar added this to the v1.18.2 milestone Dec 8, 2025
Environment=NVIDIA_CTK_CDI_OUTPUT_FILE_PATH=/var/run/cdi/nvidia.yaml
EnvironmentFile=-/etc/nvidia-container-toolkit/nvidia-cdi-refresh.env
ExecCondition=/usr/bin/grep -qE '/(nvidia|nvidia-current)\.ko[:]' /lib/modules/%v/modules.dep
ExecCondition=/usr/bin/grep -qE '/(nvidia|nvidia-current)\\.ko(\\..*)?[:]' /lib/modules/%v/modules.dep
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @Lioli7k points out, would it be simpler to remove the [:] instead? Although this was added in #1409 it may be too restrictive.

Suggested change
ExecCondition=/usr/bin/grep -qE '/(nvidia|nvidia-current)\\.ko(\\..*)?[:]' /lib/modules/%v/modules.dep
ExecCondition=/usr/bin/grep -qE '/(nvidia|nvidia-current)\\.ko' /lib/modules/%v/modules.dep

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current patch proposal will match lines in /lib/modules/%v/modules.dep that look like this:

kernel/drivers/video/nvidia.ko:
updates/dkms/nvidia-current.ko:
extra/nvidia.ko.gz: (Compressed module)
weak-updates/nvidia.ko.xz: (Different compression)

Reversing #1409 will also work, as grep will act on the .ko even if there is more after it .ko.eg, so I am ok with both options

@elezar elezar added the bug Issue/PR to expose/discuss/fix a bug label Dec 8, 2025
@elezar
Copy link
Member

elezar commented Dec 8, 2025

/cherry-pick release-1.18

Commit 1f5ce73 ("Fix trigger of CDI refresh service") did not allow
for compressed modules with suffixes such as .zst or .gz. This change
relaxes the matching to not require that the kernel module (including
extension) be followed by a colon. This allows compressed kernel modules
to also be matched.

Also fix systemd warning:

/etc/systemd/system/nvidia-cdi-refresh.service:26:
Ignoring unknown escape sequences: "/(nvidia|nvidia-current)\.ko[:]"

Signed-off-by: Ed Swarthout <Ed.Swarthout@gmail.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Copy link
Collaborator

@ArangoGutierrez ArangoGutierrez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, properly reverses #1409

Leaving approval to @elezar

Copy link
Member

@elezar elezar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @EdSwarthout. I have rebased off main. This looks good now.

@elezar
Copy link
Member

elezar commented Dec 8, 2025

LGTM, properly reverses #1409

@ArangoGutierrez this doesn't reverse #1409. It only removes the check for a trailing :. The addition of matching nvidia-current.ko in addition to nvidia.ko is still valid.

@coveralls
Copy link

Pull Request Test Coverage Report for Build 20022696837

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 37.657%

Totals Coverage Status
Change from base Build 19959511148: 0.0%
Covered Lines: 5179
Relevant Lines: 13753

💛 - Coveralls

@elezar elezar merged commit 03cd9dd into NVIDIA:main Dec 8, 2025
6 checks passed
@github-actions
Copy link

github-actions bot commented Dec 8, 2025

🤖 Backport PR created for release-1.18: #1520

@EdSwarthout
Copy link
Contributor Author

Pattern in the patch looks sketchy, since it would gladly accept something like updates/nvidia.ko.: or updates/nvidia.ko.not!an,extension:

Here are some alternative pattern suggestions:

'/(nvidia|nvidia-current)\\.ko(\\.(zst|xz|gz|bz2))?[:]' # Be really specific

I have not seen a specific list of compression methods and this is not future proof.

'/(nvidia|nvidia-current)\\.ko(\\.[[:alnum:]]+)?[:]'    # Allow any alphanumeric extension
'/(nvidia|nvidia-current)\\.ko'                         # Ignore everything after module name

Removing the colon check works, but it is not clear why it was added in #1409.

Why does this check need to be so specific? What is wrong with running the refresh service if nvidia is anywhere in the modules.dep?

What about the day when the open source driver is in the kernel and can be builtin and not a module?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Issue/PR to expose/discuss/fix a bug cherry-pick/release-1.18

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants