Skip to content

Conversation

@phambinhfin
Copy link
Contributor

This argument is only needed for NVIDIA/CUDA builds, not for ROCm builds.
Changed required=True to required=False to allow ROCm builds to proceed
without providing NVIDIA-specific data.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @phambinhfin, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request streamlines the build process for GPU kernels and plugins by making the NVIDIA wheel versions data argument optional. This change is specifically aimed at improving compatibility and ease of use for ROCm builds, allowing them to complete successfully without needing irrelevant NVIDIA-specific data, thereby preventing potential build issues in non-NVIDIA environments.

Highlights

  • NVIDIA Version Data: The --nvidia_wheel_versions_data argument has been made optional in both build_gpu_kernels_wheel.py and build_gpu_plugin_wheel.py.
  • ROCm Build Compatibility: This modification enables ROCm builds to proceed without requiring NVIDIA-specific configuration, which is unnecessary for ROCm environments.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request makes the --nvidia_wheel_versions_data argument optional to support ROCm builds. While this change is correct for ROCm, it also makes the argument optional for CUDA builds, where it is required. This could lead to build failures for CUDA if the argument is accidentally omitted. I've added comments with suggestions to add a runtime check to ensure the argument is provided for CUDA builds, which would make the implementation more robust.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly makes the --nvidia_wheel_versions_data argument optional and adds a check to ensure it's provided for CUDA builds. The changes are applied consistently across both build_gpu_kernels_wheel.py and build_gpu_plugin_wheel.py. My feedback focuses on a minor code duplication issue introduced by these changes, with a suggestion to refactor the validation logic into a shared utility for better maintainability.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly makes the --nvidia_wheel_versions_data argument optional to support ROCm builds, which do not require it. The logic is moved into a new utility function validate_cuda_requirements that conditionally checks for the argument only when a CUDA build is requested. The implementation is clean and effective. I have one suggestion to add type hints to the new function for better code consistency and maintainability.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly makes the --nvidia_wheel_versions_data argument optional to allow for ROCm builds, while adding validation to ensure it's still provided for CUDA builds. The overall approach is sound. However, I've identified a high-severity issue in the new validation logic concerning the handling of the --enable-cuda flag, which could lead to incorrect build failures. Addressing this is important for the robustness of the build process.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly makes the nvidia_wheel_versions_data argument optional for ROCm builds by changing it from required=True to required=False and introducing a new validation function. This function, validate_cuda_requirements, correctly enforces the argument's presence only when a CUDA build is enabled. This is a clean and effective solution that improves the build experience for ROCm users. The changes are well-implemented and logical.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly makes the --nvidia_wheel_versions_data argument optional and adds validation to ensure it's provided only for CUDA builds. This is a good change that enables ROCm builds without requiring NVIDIA-specific data. The use of action="store_true" for boolean flags is also a nice improvement.

I've added a couple of suggestions to use argparse.add_mutually_exclusive_group for the --enable-cuda and --enable-rocm flags. This makes it explicit that only one can be chosen, and by making the group required, it fixes a latent bug in build_gpu_kernels_wheel.py where the script would fail with an UnboundLocalError if neither flag was provided. This also improves argument validation by failing early.

Overall, the changes are in the right direction, and with the suggested improvements, the argument parsing will be more robust.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly makes the --nvidia_wheel_versions_data argument optional, which is necessary for ROCm builds. The approach of making the argument optional in argparse and then adding a separate validation function for CUDA builds is clean and effective. The introduction of a mutually exclusive group for the backend selection flags (--enable-cuda and --enable-rocm) is also a great improvement that makes the command-line interface more robust.

I have a couple of suggestions for further improvement: one to refactor duplicated argument parsing logic to improve maintainability, and another to make an error message slightly more explicit for better usability.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly makes the --nvidia_wheel_versions_data argument optional for ROCm builds while keeping it mandatory for CUDA builds by introducing a new validation function. The changes also improve the command-line interface by making the --enable-cuda and --enable-rocm flags a required, mutually exclusive group. My review includes one suggestion to remove a block of code that has become unreachable as a result of these improvements.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly makes the --nvidia_wheel_versions_data argument optional for ROCm builds by making it conditionally required only for CUDA builds. The changes are implemented well by introducing a new validation function. A significant improvement is also made to the command-line argument parsing by making the --enable-cuda and --enable-rocm flags mutually exclusive and required, which enhances robustness. The code is clean and effective. I have one suggestion to refactor some duplicated code to improve long-term maintainability.

Comment on lines 56 to 73
gpu_backend_group = parser.add_mutually_exclusive_group(required=True)
gpu_backend_group.add_argument(
"--enable-cuda",
default=False,
action="store_true",
help="Should we build with CUDA enabled? Requires CUDA and CuDNN.")
parser.add_argument(
gpu_backend_group.add_argument(
"--enable-rocm",
default=False,
action="store_true",
help="Should we build with ROCM enabled?")
parser.add_argument(
"--srcs", help="source files for the wheel", action="append"
)
parser.add_argument(
"--nvidia_wheel_versions_data",
default=None,
required=True,
required=False,
help="NVIDIA wheel versions data",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This block of argument parsing logic is also present in jaxlib/tools/build_gpu_plugin_wheel.py. To improve maintainability and avoid code duplication, consider refactoring this common logic into a shared function in build_utils.py.

For example, you could add a function like add_gpu_build_arguments(parser) to build_utils.py:

# In jaxlib/tools/build_utils.py
import argparse

def add_gpu_build_arguments(parser: argparse.ArgumentParser):
  gpu_backend_group = parser.add_mutually_exclusive_group(required=True)
  gpu_backend_group.add_argument(
      "--enable-cuda",
      action="store_true",
      help="Should we build with CUDA enabled? Requires CUDA and CuDNN.")
  gpu_backend_group.add_argument(
      "--enable-rocm",
      action="store_true",
      help="Should we build with ROCM enabled?")
  parser.add_argument(
      "--srcs", help="source files for the wheel", action="append"
  )
  parser.add_argument(
      "--nvidia_wheel_versions_data",
      default=None,
      required=False,
      help="NVIDIA wheel versions data",
  )

Then, you could call this utility function from both this script and build_gpu_plugin_wheel.py to keep the argument definitions in one place.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the argument parsing for GPU build scripts into a shared utility function, which significantly reduces code duplication and improves maintainability. The main goal, making the --nvidia_wheel_versions_data argument optional for ROCm builds, is implemented cleanly by making the argument optional and then adding a separate validation step to ensure it's provided for CUDA builds. The use of a mutually exclusive group for --enable-cuda and --enable-rocm is also a nice improvement for argument validation. Overall, this is a great change. I have one suggestion to improve the user experience.

@phambinhfin
Copy link
Contributor Author

/gemini summary |  

@gemini-code-assist
Copy link
Contributor

Summary of Changes

This pull request streamlines the JAX GPU plugin build process by making NVIDIA-specific version data optional for ROCm builds. It achieves this by centralizing and improving argument parsing and validation logic within build_utils.py. This change enhances flexibility for different GPU environments while maintaining necessary checks for CUDA builds, ultimately simplifying the build configuration and preventing errors related to irrelevant arguments.

Highlights

  • NVIDIA Version Data Optional for ROCm: The --nvidia_wheel_versions_data argument has been made optional, specifically to allow ROCm builds to proceed without requiring NVIDIA-specific data.
  • Centralized Argument Parsing: Common GPU build arguments from build_gpu_kernels_wheel.py and build_gpu_plugin_wheel.py have been refactored into a new add_gpu_build_arguments function in build_utils.py.
  • CUDA Requirement Validation: A new validate_cuda_requirements function has been introduced in build_utils.py to explicitly enforce that --nvidia_wheel_versions_data is still required when building with CUDA.
  • Mutually Exclusive GPU Backends: The --enable-cuda and --enable-rocm arguments are now defined as a mutually exclusive group, ensuring that only one GPU backend can be selected during the build process.
Changelog
  • jaxlib/tools/build_gpu_kernels_wheel.py
    • Replaced individual parser.add_argument calls with a single call to build_utils.add_gpu_build_arguments(parser).
    • Added a call to build_utils.validate_cuda_requirements(parser, args) to ensure CUDA-specific argument validation.
  • jaxlib/tools/build_gpu_plugin_wheel.py
    • Replaced individual parser.add_argument calls with a single call to build_utils.add_gpu_build_arguments(parser).
    • Added a call to build_utils.validate_cuda_requirements(parser, args) for CUDA-specific argument validation.
    • Removed an unreachable else block that raised a ValueError for unsupported backends, as argument parsing now handles this via mutually exclusive groups.
  • jaxlib/tools/build_utils.py
    • Added import argparse to support new argument parsing utilities.
    • Introduced add_gpu_build_arguments function to encapsulate common argument definitions for GPU builds, including --output_path, --jaxlib_git_hash, --cpu, --platform_version, --editable, --enable-cuda, --enable-rocm, --srcs, and --nvidia_wheel_versions_data.
    • Modified --nvidia_wheel_versions_data to be required=False within add_gpu_build_arguments.
    • Configured --enable-cuda and --enable-rocm as a mutually_exclusive_group with action="store_true" and required=True.
    • Added validate_cuda_requirements function to check if --nvidia_wheel_versions_data is provided when --enable-cuda is active, raising an error if not.
Activity
  • The pull request was initially summarized by gemini-code-assist[bot].
  • The author phambinhfin repeatedly requested reviews using /gemini review.
  • Review comments from gemini-code-assist[bot] and JehandadKhan highlighted the need to maintain the --nvidia_wheel_versions_data requirement for CUDA builds.
  • Suggestions were made to refactor argument validation logic into build_utils.py to avoid duplication.
  • Feedback included adding type hints to the new validation function and ensuring --enable-cuda/--enable-rocm use action="store_true".
  • Further suggestions led to implementing argparse.add_mutually_exclusive_group for GPU backend selection and removing redundant default=False for store_true actions.
  • The help message for --nvidia_wheel_versions_data was suggested to be more descriptive.
  • Unreachable code related to backend validation was identified and suggested for removal after the introduction of mutually exclusive argument groups.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly makes the nvidia_wheel_versions_data argument optional for ROCm builds by refactoring the argument parsing logic into shared utility functions. This not only achieves the goal but also improves code maintainability by reducing duplication. The new validation logic correctly enforces that nvidia_wheel_versions_data is provided for CUDA builds. I have a couple of minor suggestions to add return type hints to the new functions for better code clarity and consistency.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the argument parsing for GPU wheel builds by centralizing the logic into build_utils.py, which significantly improves code maintainability by reducing duplication. The primary objective of making the nvidia_wheel_versions_data argument optional for ROCm builds has been successfully achieved by setting required=False and introducing a validation function to enforce the requirement specifically for CUDA builds. I have one comment on a minor inconsistency in an argument definition.

Comment on lines 209 to 214
parser.add_argument(
"--jaxlib_git_hash",
default="",
required=True,
help="Git hash. Empty if unknown. Optional.",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The --jaxlib_git_hash argument is marked as required=True, but its help text describes it as "Optional". This is contradictory and can be confusing for users of the script. If the argument is not provided on the command line, the script will fail with a "missing argument" error, which is not what "Optional" implies. Given the default="" and how it's used, it seems the intention is for this argument to be optional. To fix this, you can remove required=True and simplify the help text.

  parser.add_argument(
      "--jaxlib_git_hash",
      default="",
      help="Git hash. Empty if unknown.",
  )

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly makes the nvidia_wheel_versions_data argument optional for ROCm builds by refactoring the argument parsing logic. The duplicated argument parsing code from build_gpu_kernels_wheel.py and build_gpu_plugin_wheel.py has been moved into a shared add_gpu_build_arguments function in build_utils.py. This is a great improvement for maintainability. The logic to conditionally require nvidia_wheel_versions_data only for CUDA builds is correctly implemented using a new validate_cuda_requirements function. Additionally, making --enable-cuda and --enable-rocm a required mutually exclusive group improves the robustness of the build scripts. I have one minor suggestion to improve the clarity of a help message for one of the arguments.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the argument parsing for GPU wheel build scripts into a shared utility module, which is a great improvement for code reuse and maintainability. The primary goal of making the nvidia_wheel_versions_data argument optional for ROCm builds is correctly implemented by making the argument not required by default and adding a validation function to enforce its presence only for CUDA builds. I have one suggestion to improve the clarity of a command-line argument definition.

@phambinhfin
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the command-line argument parsing for GPU build scripts by moving common arguments into a shared add_gpu_build_arguments function in build_utils.py. This is a good change that reduces code duplication. The core change, making the --nvidia_wheel_versions_data argument optional for ROCm builds, is correctly implemented by setting required=False and adding a separate validation function, validate_cuda_requirements, to enforce the argument's presence only for CUDA builds. Additionally, the introduction of a mutually exclusive group for --enable-cuda and --enable-rocm makes the build scripts more robust. My review includes one suggestion to improve the consistency of the help text for a command-line argument.

"--jaxlib_git_hash",
default="",
required=True,
help="Git hash. Empty if unknown.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For consistency with other required arguments in this function (e.g., --output_path, --cpu), it would be better to add "Required." to the end of this help string.

Suggested change
help="Git hash. Empty if unknown.",
help="Git hash. Empty if unknown. Required.",

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the argument parsing for GPU wheel builds by moving the common logic into a new add_gpu_build_arguments function in jaxlib/tools/build_utils.py. This is a good change that improves maintainability by reducing code duplication in build_gpu_kernels_wheel.py and build_gpu_plugin_wheel.py. The main goal of making the --nvidia_wheel_versions_data argument optional for ROCm builds is correctly implemented by making the argument not required by default and adding a new validate_cuda_requirements function to enforce its presence only for CUDA builds. The introduction of a mutually exclusive group for --enable-cuda and --enable-rocm is also a nice improvement that makes the command-line interface more robust. The changes are well-implemented and I have no suggestions for improvement.

@phambinhfin phambinhfin force-pushed the phambinh/ci_fix_build_nvidia branch from 95f527c to b8bc254 Compare November 14, 2025 12:47
@phambinhfin
Copy link
Contributor Author

@hawkinsp Could you please review this? I tried to follow the bot’s suggestions, but it ended up refactoring a lot of code, which doesn’t align with the main purpose of the pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants