HDDS-1234. Short summary of the change #24
Summary of Changes
This pull request addresses concurrency issues between database iterators and database closure. It introduces mechanisms that let RocksDB iterators handle the underlying database being closed while an iteration is in progress, preventing potential native crashes. The database closing process now waits for all active iterators to complete, and new tests cover these safety measures.
Code Review
This pull request significantly enhances the robustness and correctness of RocksDB iterators, especially when dealing with concurrent database closure. The changes introduce a mechanism to prevent native crashes by ensuring that the underlying RocksDB instance is not physically closed while an iterator is still active. This is achieved through reference counting in RocksDatabase and explicit checks within RDBStoreAbstractIterator methods. Comprehensive new test cases have been added to validate these critical concurrent scenarios, ensuring graceful handling and improved stability.
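The reference-counting idea described above can be illustrated with a minimal sketch. This is not the actual `RocksDatabase` code: `RefCountedDb` and its nested `Ref` interface are hypothetical stand-ins (with `Ref` playing the role of `UncheckedAutoCloseable`), and the sketch assumes that `close()` blocks until every outstanding reference has been released.

```java
/** Hypothetical stand-in for a database handle guarded by a reference counter. */
final class RefCountedDb {
  /** Stand-in for an unchecked AutoCloseable: close() declares no checked exception. */
  interface Ref extends AutoCloseable {
    @Override
    void close();
  }

  private final Object lock = new Object();
  private int activeRefs;
  private boolean closeRequested;
  private boolean physicallyClosed;

  /** Takes a reference; fails once close() has been requested. */
  Ref acquire() {
    synchronized (lock) {
      if (closeRequested) {
        throw new IllegalStateException("database is closed");
      }
      activeRefs++;
    }
    return new Ref() {
      private boolean released;

      @Override
      public void close() {
        synchronized (lock) {
          if (released) {
            return; // releasing twice must not decrement twice
          }
          released = true;
          activeRefs--;
          lock.notifyAll(); // wake a close() that is waiting for zero references
        }
      }
    };
  }

  /** Rejects new references, then waits for existing ones before "closing". */
  void close() {
    synchronized (lock) {
      closeRequested = true;
      boolean interrupted = false;
      while (activeRefs > 0) {
        try {
          lock.wait();
        } catch (InterruptedException e) {
          interrupted = true; // keep waiting; restore the flag afterwards
        }
      }
      physicallyClosed = true; // the native handle would be released here
      if (interrupted) {
        Thread.currentThread().interrupt();
      }
    }
  }

  boolean isPhysicallyClosed() {
    synchronized (lock) {
      return physicallyClosed;
    }
  }
}
```

In this sketch an iterator would hold one `Ref` for its whole lifetime, so a concurrent `close()` cannot release the native handle until the iterator itself is closed.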
  }

- private UncheckedAutoCloseable acquire() throws RocksDatabaseException {
+ UncheckedAutoCloseable acquire() throws RocksDatabaseException {
Changing the acquire() method's access modifier from private to package-private is a critical correctness change. This allows ManagedRocksIterator to acquire a reference to the RocksDatabase counter, which is essential for preventing the database from being physically closed while an iterator is still active. Without this, native crashes could occur.
public static ManagedRocksIterator managed(RocksIterator iterator, UncheckedAutoCloseable dbRef) {
  return new ManagedRocksIterator(iterator, dbRef);

public void close() {
  try {
    super.close();
  } finally {
    if (dbRef != null) {
      dbRef.close();
    }
  }
Modifying the close() method to release the dbRef in a finally block is a critical correctness fix. This ensures that the RocksDatabase counter is decremented, allowing the database to be physically closed once all active iterators have been released. This prevents resource leaks and native crashes.
private final UncheckedAutoCloseable dbRef;

public ManagedRocksIterator(RocksIterator original, UncheckedAutoCloseable dbRef) {
  super(original);
  this.dbRef = dbRef;
}
The addition of the dbRef field and the constructor that accepts it is a critical correctness change. This allows the ManagedRocksIterator to hold a reference to the RocksDatabase counter, ensuring the database remains open while the iterator is in use. This directly addresses the TOCTOU race condition mentioned in the Javadoc.
  final UncheckedAutoCloseable ref = acquire();
  try (ManagedReadOptions readOptions = new ManagedReadOptions()) {
    readOptions.setFillCache(fillCache);
-   return managed(db.get().newIterator(family.getHandle(), readOptions));
+   return managed(db.get().newIterator(family.getHandle(), readOptions), ref);
  } catch (RuntimeException e) {
    ref.close();
    throw e;
  }
Similar to the previous comment, modifying newIterator(ColumnFamily family, boolean fillCache) to acquire and pass a dbRef is a critical correctness fix. This ensures that the RocksDatabase remains open as long as an iterator is active, preventing native crashes. The robust error handling with ref.close() in the catch block is also well-implemented.
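The acquire-then-release-on-failure pattern praised here can be shown in isolation. The sketch below is a generic illustration, not the project's code: `GuardedFactory`, `Guarded`, and `Ref` are hypothetical names, and the two `Supplier` parameters stand in for `acquire()` and the native `newIterator(...)` call.

```java
import java.util.function.Supplier;

/** Hypothetical helper showing how a reference is handed off or released on failure. */
final class GuardedFactory {
  /** Stand-in for an unchecked closeable reference on the database. */
  interface Ref {
    void close();
  }

  /** A resource that owns a database reference until it is closed. */
  static final class Guarded<T> implements AutoCloseable {
    private final T resource;
    private final Ref ref;

    Guarded(T resource, Ref ref) {
      this.resource = resource;
      this.ref = ref;
    }

    T get() {
      return resource;
    }

    @Override
    public void close() {
      ref.close(); // ownership of the reference ends with the resource
    }
  }

  /**
   * Acquires a reference first, then builds the resource. If construction
   * throws, the reference is released before the exception propagates, so
   * a failed creation never leaves the counter incremented.
   */
  static <T> Guarded<T> create(Supplier<Ref> acquire, Supplier<T> factory) {
    final Ref ref = acquire.get();
    try {
      return new Guarded<>(factory.get(), ref);
    } catch (RuntimeException e) {
      ref.close();
      throw e;
    }
  }
}
```

On the success path the reference is deliberately not released inside `create`: ownership moves to the returned object, which releases it in its own `close()`.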
if (isDbClosed()) {
  currentEntry = null;
  return null;
}

if (isDbClosed()) {
  currentEntry = null;
  return;

if (isDbClosed()) {
  LOG.warn("Stopping iterator for table {}: underlying RocksDB is closed",
      rocksDBTable.getName());
  currentEntry = null;
  return;
}
if (!isDbClosed()) {
  rocksDBIterator.get().next();
}
if (isDbClosed()) {
  return false;
}
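The closed-database guards in the hunks above follow one pattern: check the closed flag before touching the native iterator, clear the cached entry, and report the end of iteration. A minimal sketch of that pattern follows, assuming a plain `java.util.Iterator` in place of the RocksDB iterator; `ClosedAwareIterator` and its `dbClosed` supplier are hypothetical names, not the actual `RDBStoreAbstractIterator` API.

```java
import java.util.Iterator;
import java.util.function.BooleanSupplier;

/** Hypothetical iterator wrapper that stops quietly once the database is closed. */
final class ClosedAwareIterator<T> {
  private final Iterator<T> source;       // stands in for the native iterator
  private final BooleanSupplier dbClosed; // stands in for isDbClosed()
  private T currentEntry;

  ClosedAwareIterator(Iterator<T> source, BooleanSupplier dbClosed) {
    this.source = source;
    this.dbClosed = dbClosed;
  }

  /** Reports no more entries once the database is closed. */
  boolean hasNext() {
    if (dbClosed.getAsBoolean()) {
      return false;
    }
    return source.hasNext();
  }

  /** Returns the next entry, or null if the database is closed or exhausted. */
  T next() {
    if (dbClosed.getAsBoolean() || !source.hasNext()) {
      currentEntry = null; // drop any cached entry so callers cannot reuse it
      return null;
    }
    currentEntry = source.next();
    return currentEntry;
  }

  /** Returns the entry cached by the last successful next(), if any. */
  T current() {
    return currentEntry;
  }
}
```

A flag check like this is only a check-then-act guard. In this sketch it makes iteration end cleanly, while keeping the native handle alive between the check and the call is left to the reference held by the iterator.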
What changes were proposed in this pull request?
Provide a one-liner summary of the changes in the PR Title field above.
It should be in the form of
HDDS-1234. Short summary of the change.
Please describe your PR in detail:
perspective not just for the reviewer.
the Jira's description if the jira is well defined.
issue investigation, github discussion, etc.
Examples of well-written pull requests:
What is the link to the Apache JIRA
Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull
request which starts with the corresponding JIRA issue number. (e.g. HDDS-XXXX. Fix a typo in YYY.)
If you do not have an ASF Jira account yet, please follow the first-time contributor
instructions in the Jira guideline.
(Please replace this section with the link to the Apache JIRA)
How was this patch tested?
(Please explain how this patch was tested. Ex: unit tests, manual tests, workflow run on the fork git repo.)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this.)