-
Notifications
You must be signed in to change notification settings - Fork 31.3k
Fix kernel mapping bug when multiple devices exist (Issue #42451) #42486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
ada-ggf25
wants to merge
6
commits into
huggingface:main
Choose a base branch
from
ada-ggf25:Guilherme_Grancho
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…apping Previously, the add_to_mapping function would overwrite existing layer_name and device entries in the compatible_mapping dictionary when adding new entries. This could cause loss of existing kernel mappings. The fix ensures that: - Layer name entries are preserved if they already exist - Device entries within layer names are preserved if they already exist - Only mode entries can be overwritten (which is the intended behaviour) This makes the function more robust when building compatible mappings incrementally, preventing accidental data loss.
Add comprehensive test coverage for the add_to_mapping function to ensure it correctly handles multiple devices and modes without overwriting existing entries in the compatible_mapping dictionary. The new tests verify: - Multiple devices can be added for the same layer_name without overwriting each other (test_add_to_mapping_multiple_devices) - Single device mappings are created correctly (test_add_to_mapping_single_device) - Multiple modes can be added for the same device without overwriting each other (test_add_to_mapping_multiple_modes) These tests complement the fix in kernel_config.py that prevents overwriting existing mappings when building compatible mappings incrementally.
Apply code formatting improvements to the add_to_mapping test functions: - Break long function calls across multiple lines to comply with line length guidelines - Add noqa comments to suppress line length warnings for docstrings - Fix test assertions to use the correct private attribute _repo_id instead of the non-existent public repo_id attribute This ensures the tests follow the project's code style guidelines and correctly verify the LayerRepository object properties.
Reorganise imports to group transformers.utils imports together by moving add_to_mapping import to be with other utils imports rather than between hub_kernels and masking_utils imports. Consolidate add_to_mapping function calls back to single lines as they fit within the line length limits, improving code readability.
Remove trailing whitespace from blank lines in the add_to_mapping function to comply with code style guidelines and ensure consistent formatting.
4 tasks
Member
|
cc @MekkCyber |
MekkCyber
reviewed
Dec 1, 2025
Contributor
MekkCyber
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @ada-ggf25 ! The issue is being handled here #42466. Sorry about that 🙏
Author
No worries. Happy to help! |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fix kernel mapping bug when multiple devices exist (Issue #42451)
Fixes #42451
Summary
Fixes issue #42451 where
add_to_mapping()would overwrite existing device entries whenkernel_mappingcontained multiplsame layer (e.g., both "cuda" and "rocm").
Problem
When
kernel_mappingcontains multiple devices for the same layer, callingadd_to_mapping()repeatedly would overwrite tentry instead of adding to it. This occurred because the function was completely replacing
compatible_mapping[layer_name]Example of the bug:
Solution
Updated
add_to_mapping()to check iflayer_nameanddeviceentries already exist before overwriting. The function nowlayer_nameentry if it doesn't existdeviceentry if it doesn't existThis ensures all devices are preserved in the
compatible_mappingstructure.Changes
Modified:
src/transformers/utils/kernel_config.pyadd_to_mapping()function to preserve existing device entriesAdded:
tests/kernels/test_kernels.pytest_add_to_mapping_multiple_devices: Verifies multiple devices are preservedtest_add_to_mapping_single_device: Ensures backward compatibilitytest_add_to_mapping_multiple_modes: Verifies multiple modes work correctlyTesting
All tests pass:
test_add_to_mapping_multiple_devices- PASSEDtest_add_to_mapping_multiple_modes- PASSEDtest_add_to_mapping_single_device- PASSEDTestKernelUtilities- PASSED