Skip to content

HDDS-1234. Short summary of the change#24

Closed
priyeshkaratha wants to merge 12 commits intomasterfrom
test_rocksdb
Closed

HDDS-1234. Short summary of the change#24
priyeshkaratha wants to merge 12 commits intomasterfrom
test_rocksdb

Conversation

@priyeshkaratha
Copy link
Copy Markdown
Owner

What changes were proposed in this pull request?

Provide a one-liner summary of the changes in the PR Title field above.
It should be in the form of HDDS-1234. Short summary of the change.

Please describe your PR in detail:

  • What changes are proposed in the PR? and Why? It would be better if it is written from third person's
    perspective not just for the reviewer.
  • Provide as much context and rationale for the pull request as possible. It could be copy-paste from
    the Jira's description if the jira is well defined.
  • If it is complex code, describe the approach used to solve the issue. If possible attach design doc,
    issue investigation, github discussion, etc.

Examples of well-written pull requests:

What is the link to the Apache JIRA

Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull
request which starts with the corresponding JIRA issue number. (e.g. HDDS-XXXX. Fix a typo in YYY.)

If you do not have an ASF Jira account yet, please follow the first-time contributor
instructions in the Jira guideline.

(Please replace this section with the link to the Apache JIRA)

How was this patch tested?

(Please explain how this patch was tested. Ex: unit tests, manual tests, workflow run on the fork git repo.)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this.)

@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses critical concurrency issues related to database iterators and database closure. It introduces mechanisms to ensure that RocksDB iterators can gracefully handle scenarios where the underlying database is closed while an iteration is in progress, preventing potential native crashes and improving system stability. The changes also enhance the database closing process to wait for all active iterators to complete, and add comprehensive test coverage for these new safety measures.

Highlights

  • Robust Iterator Handling for Concurrent DB Closure: Implemented robust handling within RDBStoreAbstractIterator to prevent native crashes and ensure graceful termination of iteration when the underlying RocksDB database is closed concurrently.
  • Enhanced RocksDatabase Closure Management: Modified RocksDatabase to track active iterators, ensuring the database is not physically closed until all iterators have released their references, thereby preventing use-after-free issues.
  • Improved Container Scanner Resilience: Added new tests to TestBackgroundContainerDataScanner to validate graceful shutdown and clean exit of container data scanners when a volume fails or the underlying database is closed during iteration.
  • Thread-Safe and Null-Safe DB Store Closure: Updated AbstractRDBStore.close() to be synchronized and include a null check, improving its thread safety and preventing NullPointerException on multiple calls.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly enhances the robustness and correctness of RocksDB iterators, especially when dealing with concurrent database closure. The changes introduce a mechanism to prevent native crashes by ensuring that the underlying RocksDB instance is not physically closed while an iterator is still active. This is achieved through reference counting in RocksDatabase and explicit checks within RDBStoreAbstractIterator methods. Comprehensive new test cases have been added to validate these critical concurrent scenarios, ensuring graceful handling and improved stability.

}

private UncheckedAutoCloseable acquire() throws RocksDatabaseException {
UncheckedAutoCloseable acquire() throws RocksDatabaseException {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Changing the acquire() method's access modifier from private to package-private is a critical correctness change. This allows ManagedRocksIterator to acquire a reference to the RocksDatabase counter, which is essential for preventing the database from being physically closed while an iterator is still active. Without this, native crashes could occur.

Comment on lines +60 to +61
public static ManagedRocksIterator managed(RocksIterator iterator, UncheckedAutoCloseable dbRef) {
return new ManagedRocksIterator(iterator, dbRef);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The new managed factory method that accepts a dbRef is essential for integrating the new reference counting mechanism. This provides a clean way to create ManagedRocksIterator instances that correctly manage the underlying RocksDatabase lifecycle.

Comment on lines +46 to +53
public void close() {
try {
super.close();
} finally {
if (dbRef != null) {
dbRef.close();
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Modifying the close() method to release the dbRef in a finally block is a critical correctness fix. This ensures that the RocksDatabase counter is decremented, allowing the database to be physically closed once all active iterators have been released. This prevents resource leaks and native crashes.

Comment on lines +34 to +39
private final UncheckedAutoCloseable dbRef;

public ManagedRocksIterator(RocksIterator original, UncheckedAutoCloseable dbRef) {
super(original);
this.dbRef = dbRef;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The addition of the dbRef field and the constructor that accepts it is a critical correctness change. This allows the ManagedRocksIterator to hold a reference to the RocksDatabase counter, ensuring the database remains open while the iterator is in use. This directly addresses the TOCTOU race condition mentioned in the Javadoc.

Comment on lines +786 to 793
final UncheckedAutoCloseable ref = acquire();
try (ManagedReadOptions readOptions = new ManagedReadOptions()) {
readOptions.setFillCache(fillCache);
return managed(db.get().newIterator(family.getHandle(), readOptions));
return managed(db.get().newIterator(family.getHandle(), readOptions), ref);
} catch (RuntimeException e) {
ref.close();
throw e;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Similar to the previous comment, modifying newIterator(ColumnFamily family, boolean fillCache) to acquire and pass a dbRef is a critical correctness fix. This ensures that the RocksDatabase remains open as long as an iterator is active, preventing native crashes. The robust error handling with ref.close() in the catch block is also well-implemented.

Comment on lines +160 to +163
if (isDbClosed()) {
currentEntry = null;
return null;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The isDbClosed() check in seek() is a crucial correctness fix. This prevents native seek operations from being called on a closed database, which could result in a native crash. Returning null and setting currentEntry = null provides a safe exit.

Comment on lines +146 to +148
if (isDbClosed()) {
currentEntry = null;
return;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The isDbClosed() check in seekToLast() is a critical correctness addition. Attempting to perform a native seek operation on a closed database can lead to a crash, so this guard ensures safe behavior.

Comment on lines +96 to +101
if (isDbClosed()) {
LOG.warn("Stopping iterator for table {}: underlying RocksDB is closed",
rocksDBTable.getName());
currentEntry = null;
return;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The addition of the isDbClosed() check in setCurrentEntry() is a critical correctness improvement. This prevents potential native crashes by ensuring that the iterator gracefully stops processing if the underlying database has been closed, rather than attempting to access invalid native resources.

Comment on lines +122 to +124
if (!isDbClosed()) {
rocksDBIterator.get().next();
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The conditional call to rocksDBIterator.get().next() based on !isDbClosed() is an important correctness fix. This prevents the iterator from attempting to advance in the native RocksDB iterator if the database has already been closed, thereby avoiding potential native crashes.

Comment on lines +111 to +113
if (isDbClosed()) {
return false;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Adding the isDbClosed() check to hasNext() is a crucial correctness fix. This ensures that the iterator can fast-fail and return false immediately if the database is closed, preventing calls to rocksDBIterator.get().isValid() on a potentially invalid native object, which could lead to a crash.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant