-
Notifications
You must be signed in to change notification settings - Fork 116
Prevent parallel file lock clear #3714
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
...ne/src/main/java/com/apple/foundationdb/record/lucene/directory/FDBDirectoryLockFactory.java
Outdated
Show resolved
Hide resolved
...ne/src/main/java/com/apple/foundationdb/record/lucene/directory/FDBDirectoryLockFactory.java
Outdated
Show resolved
Hide resolved
...ne/src/main/java/com/apple/foundationdb/record/lucene/directory/FDBDirectoryLockFactory.java
Show resolved
Hide resolved
...ne/src/main/java/com/apple/foundationdb/record/lucene/directory/FDBDirectoryLockFactory.java
Outdated
Show resolved
Hide resolved
FDBDirectory: When file lock clear is called in parallel by multiple threads, synchronically waiting for the first thread's clear to complete seems to be a better strategy. Resolves FoundationDB#3713
9b5b678 to
62dfa1b
Compare
62dfa1b to
e81f7d2
Compare
| closed = flushed; // allow close retry | ||
| closingContext = null; | ||
| synchronized (clearingLockNowLock) { | ||
| // clearing under lock to avoid spotbugsMain's "Inconsistent synchronization" issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be in a finally around all the code above?
i.e. what happens if the asyncToSync calls above fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
I've moved it to a fileLockFunc's whenComplete step.
| private void fileLockClearFlushAndClose(boolean isRecovery) { | ||
| synchronized (clearingLockNowLock) { | ||
| // Here: this function is being called from too many paths. Until cleanup, this guard is here to avoid recursions | ||
| if (clearingLockNow) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only seen it for lazy instantiation, but would it be worth to duplicate this before the synchronized lock to avoid entering the Synchronized lock if already clearing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can either do that or remove the synchronized lock altogether (since our concern is recursion, the synchronized lock might be an overkill...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed clearingLockNowLock - this will now only protect from recursion, but not from concurrency (which - AFAIK - does not happen).
| } | ||
| } | ||
|
|
||
| private void fileLockClearFlushAndClose(boolean isRecovery) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like there are some concurrency challenges with FDBDirectoryLock, it would probably be good to add some tests that try to clear this lock concurrently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that there are concurrent calls for this (within the same session) but recursive calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, in that case, it would be good to have tests that cause this to happen recursively. that might be very hard at the higher levels, but should be doable at this level, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot recreate it and wondering if one the two rebases eliminated the scenario. Switching to draft for a full unit testing ...
ohadzeliger
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree to comment on concurrent close tests
FDBDirectory: When file lock clear is called in parallel by multiple threads, synchronically waiting for the first thread's clear to complete seems to be a better strategy.
Resolves #3713