
705 fix gres implementation #710

Merged
jkrue merged 6 commits into dev from 705-fixrollback-gres-implementation
Feb 19, 2026

Conversation

@XaverStiensmeier
Contributor

This PR adds the driver setup needed for NVIDIA graphics cards to run without issues. It will not work for nodes that use other graphics cards.

Note that this implementation assumes that NVIDIA device files are numbered 0-N. This assumption is necessary to configure Slurm nodes with one or more graphics cards correctly.
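
To illustrate why the 0-N numbering matters, here is a minimal Python sketch of how a Slurm `gres.conf` entry could be derived from nothing more than a GPU count. It is not the code from this PR: the helper name `gres_conf_line` and its parameters are hypothetical, and it only relies on the `Name=` and `File=` keys of `gres.conf` together with the `/dev/nvidia<index>` device naming assumed above.

```python
def gres_conf_line(gpu_count: int, name: str = "gpu") -> str:
    """Build one gres.conf line for a node with `gpu_count` NVIDIA GPUs.

    Hypothetical helper: assumes the device files are
    /dev/nvidia0 .. /dev/nvidia<gpu_count - 1> with no gaps.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    if gpu_count == 1:
        # A single card needs no range expression.
        device = "/dev/nvidia0"
    else:
        # Slurm accepts a bracketed range for consecutive device files.
        device = f"/dev/nvidia[0-{gpu_count - 1}]"
    return f"Name={name} File={device}"


if __name__ == "__main__":
    print(gres_conf_line(1))
    print(gres_conf_line(4))
```

If the device files were not numbered consecutively from 0, the range could not be computed from the count alone and each device path would have to be discovered on the node instead.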

As discussed, this PR also limits the allowed node images to 24.04 in anticipation of the Slurm update.

A new configuration schema is now supported that essentially splits the master configuration and the other configurations into separate elements in the YAML. This allows for a more pydantic-friendly format, but it is neither required nor intended to be adopted yet. It could be considered for the next major version.

documentation/markdown/features/gres.md gives a short description of how BiBiGrid installs the drivers, so that GPU node users can provide additional playbooks if they need more setup.

@jkrue jkrue merged commit a238fa1 into dev Feb 19, 2026
4 checks passed
