fix: add missing modprobe params for cosim amdgpu driver init #10
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b98db1705c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
The cosim environment requires ppfeaturemask=0 dpm=0 audio=0 in addition to ip_block_mask=0x67 to prevent the ROCm 7.0 driver from accessing unmodeled power management and audio registers in gem5. Without these parameters, modprobe fails with -EINVAL (-22) because the driver attempts PowerPlay/DPM initialization against registers that gem5 does not handle.

Update all modprobe commands and documentation across scripts, docs, and CLAUDE.md to use the complete parameter set.

Also fix cosim_guest_setup.sh to stop delegating to load_amdgpu.sh (which is for standalone gem5 and uses ip_block_mask=0x6f with PSP enabled).

Fixes #9

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
Branch updated: b98db17 to 3ad8e90
…_setup

The kernel cmdline passes modprobe.blacklist=amdgpu, which creates /run/modprobe.d/ entries that cause modprobe to silently exit 0 without loading the module. Clear the runtime blacklist before probing. The same fix was already applied to cosim-gpu-setup.sh but was missing from cosim_guest_setup.sh.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
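The blacklist-clearing step described in this commit could look roughly like the sketch below. The function name `clear_runtime_blacklist` and the content-based match on `blacklist <module>` are assumptions for illustration; the actual scripts may simply remove a known file under /run/modprobe.d/.

```shell
#!/usr/bin/env bash

# Hypothetical helper: remove runtime modprobe config files that blacklist
# the given module. The directory defaults to /run/modprobe.d, the location
# named in the commit message; it is a parameter so the logic can be tried
# against a scratch directory.
clear_runtime_blacklist() {
    local module="$1" dir="${2:-/run/modprobe.d}"
    local f
    [ -d "$dir" ] || return 0
    for f in "$dir"/*.conf; do
        [ -e "$f" ] || continue   # glob matched nothing
        if grep -Eq "^[[:space:]]*blacklist[[:space:]]+${module}([[:space:]]|\$)" "$f"; then
            rm -f -- "$f"
            echo "removed $f"
        fi
    done
    return 0
}

# Intended use in a guest setup script (requires root):
#   clear_runtime_blacklist amdgpu
#   modprobe amdgpu ...
```

Matching on file content rather than a fixed file name keeps the sketch independent of whichever component generated the entry.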
- Convert AMDGPU_ARGS from string to bash array to satisfy SC2086 in both cosim_guest_setup.sh and cosim-gpu-setup.sh
- Add rm blacklist step to cosim_guest_setup.sh (codex review)
- Add tests/test_modprobe_params.sh: verifies all cosim scripts include required modprobe parameters (ppfeaturemask, dpm, etc.)
- Add modprobe-params CI job to prevent regression of issue #9

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
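For readers unfamiliar with SC2086, the string-to-array conversion can be sketched as follows. The parameter values are the ones listed in the first commit message of this PR; the `load_amdgpu` function and the `DRY_RUN` switch are hypothetical additions so the snippet can be run without root or a real GPU model.

```shell
#!/usr/bin/env bash

# Each module parameter is its own array element, so no unquoted expansion
# (and no word splitting) is needed when passing them to modprobe.
AMDGPU_ARGS=(ip_block_mask=0x67 ppfeaturemask=0 dpm=0 audio=0)

# Hypothetical wrapper: with DRY_RUN=1 it only prints the command.
load_amdgpu() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "modprobe amdgpu ${AMDGPU_ARGS[*]}"
    else
        modprobe amdgpu "${AMDGPU_ARGS[@]}"
    fi
}

# Example: DRY_RUN=1 load_amdgpu
```

Quoting `"${AMDGPU_ARGS[@]}"` expands to one argument per element, which is what ShellCheck recommends over an unquoted `$AMDGPU_ARGS` string.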
💡 Codex Review
Reviewed commit: 39b86f188a
Tested this fix, works for me. Thanks
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
💡 Codex Review
Reviewed commit: 94c8e9afbc
…clone

- Rewrite test to check each AMDGPU_ARGS definition and inline modprobe/insmod line independently instead of merging into one blob
- Remove submodules: true from ShellCheck job (not needed)
- Only init gem5-resources submodule for modprobe-params job

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
💡 Codex Review
Reviewed commit: 89c1fe6ad6
.gitmodules uses SSH URLs, which fail on CI runners without deploy keys. Override them with git config insteadOf to use HTTPS.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
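A minimal sketch of such an override is shown below. The `github.com` host and the `git@github.com:` URL form are assumptions about what .gitmodules contains; the function name `use_https_for_submodules` is hypothetical.

```shell
#!/usr/bin/env bash

# Rewrite scp-style SSH GitHub URLs to HTTPS for the current user, so that
# submodule fetches work on a runner that has no SSH deploy key.
use_https_for_submodules() {
    git config --global url."https://github.com/".insteadOf "git@github.com:"
}

# In a CI step this would be followed by initializing the needed submodule:
#   use_https_for_submodules
#   git submodule update --init <submodule-path>
```

If .gitmodules uses the `ssh://git@github.com/` form instead, a second `insteadOf` value for that prefix would be needed.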
💡 Codex Review
Reviewed commit: ec2e9d6da4
…_mask

Broaden the regex to catch any modprobe/insmod amdgpu invocation so the test also fails if ip_block_mask itself is removed. Skip lines that use the AMDGPU_ARGS variable (validated via the definition check).

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
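The broadened scan might be sketched like this. The regex and the function name `find_inline_amdgpu_loads` are illustrative assumptions, not the contents of tests/test_modprobe_params.sh.

```shell
#!/usr/bin/env bash

# Print (with line numbers) every non-comment line in a script that invokes
# modprobe or insmod on amdgpu, skipping lines that expand AMDGPU_ARGS,
# since those are covered by checking the array definition instead.
find_inline_amdgpu_loads() {
    local file="$1"
    grep -nE '^[^#]*(modprobe|insmod)[[:space:]].*amdgpu' "$file" \
        | grep -v 'AMDGPU_ARGS' || true
}

# Example: find_inline_amdgpu_loads scripts/cosim_guest_setup.sh
```

Because the pattern no longer depends on `ip_block_mask` being present, a line such as `modprobe amdgpu dpm=0` is still reported and can then be checked for the full parameter set.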
💡 Codex Review
Reviewed commit: 6164191899
Substring matching allowed incorrect values like ppfeaturemask=0xff to pass (contains "ppfeaturemask=0"). Normalize delimiters and use word-boundary matching to require exact param=value tokens.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
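One way to get exact-token matching is sketched below. It uses whole-word comparison after normalizing delimiters rather than a regex word boundary, and the function name `has_exact_param` is hypothetical; the real test may be implemented differently.

```shell
#!/usr/bin/env bash

# Return 0 only if the exact token (e.g. "ppfeaturemask=0") appears as its
# own word in the given line.
has_exact_param() {
    local line="$1" want="$2" tok
    local -a toks
    # Turn quotes, commas and parentheses into spaces so each param=value
    # pair becomes a separate whitespace-delimited word.
    line=$(printf '%s' "$line" | tr "\"',()" '     ')
    read -r -a toks <<< "$line"
    for tok in "${toks[@]}"; do
        if [ "$tok" = "$want" ]; then
            return 0
        fi
    done
    return 1
}

# A substring check would accept this line; the exact-token check rejects it.
has_exact_param 'modprobe amdgpu ppfeaturemask=0xff dpm=0' 'ppfeaturemask=0' \
    || echo "missing ppfeaturemask=0"
```

Comparing whole tokens avoids relying on regex word boundaries, which treat `x` in `0xff` as a word character but can behave unexpectedly around `=`.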
Summary
- Stop `cosim-gpu-setup.sh` from delegating to `/home/gem5/load_amdgpu.sh` (designed for standalone gem5, uses `ip_block_mask=0x6f` with PSP enabled)
- Add `ppfeaturemask=0 dpm=0 audio=0` parameters to all cosim modprobe commands; without these, the ROCm 7.0 driver tries to init PowerPlay/DPM against unmodeled gem5 registers, returning `-EINVAL`
- `insmod` fallback for `.ko.zst` compressed modules (Ubuntu 24.04)

Test plan
- `cosim-gpu-setup.service` loads amdgpu successfully
- `rocm-smi` shows device 0x74a0 and `rocminfo` shows gfx942
- `modprobe amdgpu ip_block_mask=0x67 ppfeaturemask=0 dpm=0 audio=0 ras_enable=0 discovery=2`

Fixes #9