Skip to content

Conversation

@jjezra
Copy link
Contributor

@jjezra jjezra commented Oct 31, 2025

FDBDirectory: When file lock clear is called in parallel by multiple threads, synchronically waiting for the first thread's clear to complete seems to be a better strategy.

Resolves #3713

@jjezra jjezra requested a review from ohadzeliger October 31, 2025 18:32
@jjezra jjezra added the bug fix Change that fixes a bug label Oct 31, 2025
@jjezra jjezra marked this pull request as ready for review November 3, 2025 13:42
@ScottDugas ScottDugas requested review from ScottDugas and removed request for ScottDugas November 3, 2025 17:50
@jjezra jjezra requested a review from ScottDugas November 3, 2025 17:53
  FDBDirectory: When file lock clear is called in parallel by multiple threads, synchronically waiting for the first thread's clear to complete seems to be a better strategy.

  Resolves FoundationDB#3713
@jjezra jjezra force-pushed the lucene_block_parallel_lock_clear branch from 9b5b678 to 62dfa1b Compare November 11, 2025 17:12
@jjezra jjezra requested a review from ohadzeliger November 11, 2025 17:13
@jjezra jjezra force-pushed the lucene_block_parallel_lock_clear branch from 62dfa1b to e81f7d2 Compare November 11, 2025 18:28
closed = flushed; // allow close retry
closingContext = null;
synchronized (clearingLockNowLock) {
// clearing under lock to avoid spotbugsMain's "Inconsistent synchronization" issue.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be in a finally around all the code above?
i.e. what happens if the asyncToSync calls above fail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!
I've moved it to a fileLockFunc's whenComplete step.

private void fileLockClearFlushAndClose(boolean isRecovery) {
synchronized (clearingLockNowLock) {
// Here: this function is being called from too many paths. Until cleanup, this guard is here to avoid recursions
if (clearingLockNow) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only seen it for lazy instantiation, but would it be worth to duplicate this before the synchronized lock to avoid entering the Synchronized lock if already clearing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can either do that or remove the synchronized lock altogether (since our concern is recursion, the synchronized lock might be an overkill...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed clearingLockNowLock - this will now only protect from recursion, but not from concurrency (which - AFAIK - does not happen).

}
}

private void fileLockClearFlushAndClose(boolean isRecovery) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like there are some concurrency challenges with FDBDirectoryLock, it would probably be good to add some tests that try to clear this lock concurrently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that there are concurrent calls for this (within the same session) but recursive calls.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, in that case, it would be good to have tests that cause this to happen recursively. that might be very hard at the higher levels, but should be doable at this level, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot recreate it and wondering if one the two rebases eliminated the scenario. Switching to draft for a full unit testing ...

Copy link
Contributor

@ohadzeliger ohadzeliger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree to comment on concurrent close tests

@jjezra jjezra requested a review from ScottDugas November 17, 2025 16:30
@jjezra jjezra marked this pull request as draft November 17, 2025 23:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug fix Change that fixes a bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Prevent parallel file lock clear

3 participants