@dragonJACson
Contributor

or_insert() eagerly evaluates the default argument even when the entry exists, causing a new DeviceContext to be created and immediately dropped. This triggers ibv_close_device() on the shared verbs context, corrupting the cached entry.

Switch to or_insert_with() so the DeviceContext is only constructed when the HashMap entry is missing.
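The difference can be reproduced outside the rdmacm module. The sketch below is a minimal illustration, not the project's code: `Tracked` and `Counters` are hypothetical stand-ins for `DeviceContext` that only count constructions and drops, so the effect of each call is visible without an RDMA device.

```rust
use std::cell::Cell;
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical bookkeeping shared by every Tracked value.
#[derive(Default)]
struct Counters {
    created: Cell<usize>,
    dropped: Cell<usize>,
}

// Hypothetical stand-in for DeviceContext: records construction and drop.
struct Tracked {
    counters: Rc<Counters>,
}

impl Tracked {
    fn new(counters: &Rc<Counters>) -> Self {
        counters.created.set(counters.created.get() + 1);
        Tracked { counters: Rc::clone(counters) }
    }
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.counters.dropped.set(self.counters.dropped.get() + 1);
    }
}

// Looks up the same key twice with or_insert().
// Returns (created, dropped) while the map is still alive.
fn lookups_with_or_insert() -> (usize, usize) {
    let counters = Rc::new(Counters::default());
    let mut map: HashMap<usize, Tracked> = HashMap::new();
    map.entry(1).or_insert(Tracked::new(&counters));
    // The argument is built before the call, then dropped because the key exists.
    map.entry(1).or_insert(Tracked::new(&counters));
    let counts = (counters.created.get(), counters.dropped.get());
    counts
}

// Looks up the same key twice with or_insert_with().
fn lookups_with_or_insert_with() -> (usize, usize) {
    let counters = Rc::new(Counters::default());
    let mut map: HashMap<usize, Tracked> = HashMap::new();
    map.entry(1).or_insert_with(|| Tracked::new(&counters));
    // The closure never runs here because the key exists.
    map.entry(1).or_insert_with(|| Tracked::new(&counters));
    let counts = (counters.created.get(), counters.dropped.get());
    counts
}

fn main() {
    println!("or_insert:      {:?}", lookups_with_or_insert());
    println!("or_insert_with: {:?}", lookups_with_or_insert_with());
}
```

In the real code the extra drop is what matters: per the description above, dropping the throwaway `DeviceContext` calls `ibv_close_device()` on the verbs context that the cached entry still uses.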

@gemini-code-assist

Summary of Changes

Hello @dragonJACson, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a subtle but critical bug in the rdmacm module related to the lifecycle management of DeviceContext objects. By transitioning to a lazy initialization pattern for cached device contexts, the change prevents erroneous object creation and subsequent corruption of shared resources. The inclusion of a dedicated test case ensures the caching mechanism now behaves as expected, enhancing the stability and correctness of the system.

Highlights

  • Bug Fix: Resolved a critical bug where DeviceContext instances were eagerly created and immediately dropped when using or_insert(), leading to corruption of cached entries due to ibv_close_device() being called prematurely.
  • Performance/Correctness Improvement: Switched from HashMap::entry().or_insert() to or_insert_with() to ensure that DeviceContext is only constructed when its entry is actually missing in the HashMap, preventing unnecessary object creation and destruction.
  • New Test Case: Added a new test, test_get_device_context_caches_correctly, to explicitly verify that DeviceContext instances are correctly cached and reused, confirming the fix's effectiveness.
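A caching test of the kind described in the last highlight might look roughly like the sketch below. It is an assumption-laden illustration, not the PR's test: `FakeContext` and `ContextCache` are hypothetical names, and the real code keys the map by the verbs pointer of an rdma_cm id rather than by a plain integer.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for DeviceContext; the real type wraps a verbs context.
struct FakeContext {
    key: usize,
}

// Hypothetical cache with the same shape as the one in the PR:
// a mutex-guarded map from a pointer-sized key to a shared context.
struct ContextCache {
    entries: Mutex<HashMap<usize, Arc<FakeContext>>>,
}

impl ContextCache {
    fn new() -> Self {
        ContextCache { entries: Mutex::new(HashMap::new()) }
    }

    // Returns the cached context for `key`, constructing it only on a miss.
    fn get(&self, key: usize) -> Arc<FakeContext> {
        let mut guard = self.entries.lock().unwrap();
        let entry = guard
            .entry(key)
            .or_insert_with(|| Arc::new(FakeContext { key }));
        Arc::clone(entry)
    }
}

// Three lookups of one key should all yield the same allocation.
fn same_key_is_shared() -> bool {
    let cache = ContextCache::new();
    let (a, b, c) = (cache.get(7), cache.get(7), cache.get(7));
    a.key == 7 && Arc::ptr_eq(&a, &b) && Arc::ptr_eq(&b, &c)
}

// Lookups of different keys should yield different allocations.
fn different_keys_are_distinct() -> bool {
    let cache = ContextCache::new();
    !Arc::ptr_eq(&cache.get(1), &cache.get(2))
}

fn main() {
    println!("same key shared: {}", same_key_is_shared());
    println!("different keys distinct: {}", different_keys_are_distinct());
}
```

`Arc::ptr_eq` compares the allocations rather than the values, so it passes only if the cache hands back the same `Arc` each time.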

…eation

`or_insert()` eagerly evaluates the default argument even when the entry
exists, causing a new `DeviceContext` to be created and immediately
dropped. This triggers `ibv_close_device()` on the shared verbs context,
corrupting the cached entry.

Switch to `or_insert_with()` so the `DeviceContext` is only constructed
when the `HashMap` entry is missing.

Signed-off-by: Luke Yue <lukedyue@gmail.com>

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a critical bug where or_insert() was causing eager creation and destruction of DeviceContext, leading to corruption of the shared verbs context. The switch to or_insert_with() is the correct approach to ensure the DeviceContext is only created when necessary. The addition of the test_get_device_context_caches_correctly test is excellent, as it properly verifies the fix and prevents regressions. I have a couple of suggestions to improve the code's readability and efficiency.

Comment on lines +850 to +855
let device_ctx = guard.entry((*cm_id.as_ptr()).verbs as usize).or_insert_with(|| {
    Arc::new(DeviceContext {
        // Safe due to the is_null() check above.
        context: NonNull::new((*cm_id.as_ptr()).verbs).unwrap(),
    })
});

medium

For better readability and to adhere to common Rust formatting conventions, it's recommended to format this chained method call across multiple lines. This makes the code easier to parse visually, especially with the closure.

Suggested change
let device_ctx = guard.entry((*cm_id.as_ptr()).verbs as usize).or_insert_with(|| {
    Arc::new(DeviceContext {
        // Safe due to the is_null() check above.
        context: NonNull::new((*cm_id.as_ptr()).verbs).unwrap(),
    })
});
let device_ctx = guard
    .entry((*cm_id.as_ptr()).verbs as usize)
    .or_insert_with(|| {
        Arc::new(DeviceContext {
            // Safe due to the is_null() check above.
            context: NonNull::new((*cm_id.as_ptr()).verbs).unwrap(),
        })
    });

Comment on lines +1183 to +1190
assert!(
    Arc::ptr_eq(&ctx1.clone().unwrap(), &ctx2.clone().unwrap()),
    "ctx1 and ctx2 should point to the same DeviceContext"
);
assert!(
    Arc::ptr_eq(&ctx2.clone().unwrap(), &ctx3.clone().unwrap()),
    "ctx2 and ctx3 should point to the same DeviceContext"
);

medium

To make the code more idiomatic and efficient, you can use as_ref() instead of clone() when comparing the Arc pointers. as_ref() borrows the content of the Option without incrementing the Arc's reference count, which is more efficient as you only need a reference for Arc::ptr_eq.

Suggested change
assert!(
    Arc::ptr_eq(&ctx1.clone().unwrap(), &ctx2.clone().unwrap()),
    "ctx1 and ctx2 should point to the same DeviceContext"
);
assert!(
    Arc::ptr_eq(&ctx2.clone().unwrap(), &ctx3.clone().unwrap()),
    "ctx2 and ctx3 should point to the same DeviceContext"
);
assert!(
    Arc::ptr_eq(ctx1.as_ref().unwrap(), ctx2.as_ref().unwrap()),
    "ctx1 and ctx2 should point to the same DeviceContext"
);
assert!(
    Arc::ptr_eq(ctx2.as_ref().unwrap(), ctx3.as_ref().unwrap()),
    "ctx2 and ctx3 should point to the same DeviceContext"
);
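
The reference-count claim in the comment above can be checked with `Arc::strong_count`. This is a small sketch under the assumption that the test's `ctx1`, `ctx2` and `ctx3` are `Option<Arc<DeviceContext>>` values; `Arc<String>` is used here as a hypothetical stand-in.

```rust
use std::sync::Arc;

// Strong count observed while a handle obtained via clone().unwrap() is alive.
fn count_with_clone(opt: &Option<Arc<String>>) -> usize {
    // Cloning the Option clones the inner Arc, adding one strong reference.
    let owned: Arc<String> = opt.clone().unwrap();
    Arc::strong_count(&owned)
}

// Strong count observed while a reference obtained via as_ref().unwrap() is alive.
fn count_with_as_ref(opt: &Option<Arc<String>>) -> usize {
    // as_ref() turns &Option<Arc<T>> into Option<&Arc<T>>; no new Arc is made.
    let borrowed: &Arc<String> = opt.as_ref().unwrap();
    Arc::strong_count(borrowed)
}

fn main() {
    let ctx = Some(Arc::new(String::from("device")));
    println!("with clone:  {}", count_with_clone(&ctx));
    println!("with as_ref: {}", count_with_as_ref(&ctx));
}
```

Both forms give `Arc::ptr_eq` the same pointers to compare, so the suggestion changes only the temporary reference-count traffic, not the outcome of the assertions.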

@codecov

codecov bot commented Dec 12, 2025

Codecov Report

❌ Patch coverage is 93.54839% with 2 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
src/rdmacm/communication_manager.rs | 93.54% | 2 Missing ⚠️

Files with missing lines | Coverage Δ
src/rdmacm/communication_manager.rs | 89.16% <93.54%> (+1.56%) ⬆️