Revise LockWAL API and WAL collection #10953
c735776 to
50bcf44
|
cc @hermanlee |
|
Need to address the lock-order inversion. UPDATE: done. |
|
According to comment: https://github.com/facebook/rocksdb/blob/main/db/db_impl/db_impl.h#L2290:L2291, application should always lock |
4a32763 to
f358248
|
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
@riversand963 has updated the pull request. You must reimport the pull request before landing. |
|
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
473d439 to
6bafc70
|
@riversand963 has updated the pull request. You must reimport the pull request before landing. |
|
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
@riversand963 has updated the pull request. You must reimport the pull request before landing. |
|
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
a396691 to
d8754f4
|
@riversand963 has updated the pull request. You must reimport the pull request before landing. |
|
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
d8754f4 to
50cae81
|
@riversand963 has updated the pull request. You must reimport the pull request before landing. |
50cae81 to
ee51e95
|
@riversand963 has updated the pull request. You must reimport the pull request before landing. |
|
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
ee51e95 to
63e257d
|
@riversand963 has updated the pull request. You must reimport the pull request before landing. |
|
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
| } | ||
| } | ||
| } | ||
| log_write_mutex_.Lock(); |
This doesn't look like it'd block writes like the original PR (#5146) claimed. Does it? I think explicitly blocking writes (like a write stop) may work better and be simpler.
I also fail to see how the backup worked before without calling DisableFileDeletions(). Did it work?
> I also fail to see how the backup worked before without calling DisableFileDeletions(). Did it work?
MyRocks backup tools do disable file deletions first.
The #5146 PR claimed that: "Walk through the storage engines and lock writes on each engine. For
InnoDB, redo log is locked. For MyRocks, WAL should be locked."
My original impression is that WAL is locked, thus any writes with WAL will be blocked after this call.
Furthermore, I thought introducing write stop is undesirable for users, but maybe that's over-optimizing for this use case.
> My original impression is that WAL is locked, thus any writes with WAL will be blocked after this call.
OK, I didn't have that impression by looking at how log_write_mutex_ is used. I will test it then since my concern isn't shared.
Taking another look at https://github.com/facebook/rocksdb/blob/7.9.fb/db/db_impl/db_impl_write.cc#L1213. In fact, all threads calling PreprocessWrite() may be blocked.
It did not work originally but #7516 fixed it, probably without noticing. I still think we should re-think the approach as it feels unstable and unintuitive.
I am also hesitant about exposing log_write_mutex_ to the application.
Maybe, similar to what InnoDB does, we don't need to keep log_write_mutex_ locked as long as we disable file deletion.
| const bool track_owner_ = false; | ||
|
|
||
| #if defined(OS_WIN) && !defined(_POSIX_THREADS) | ||
| std::atomic<port::ThreadId> owner_{kDummyOwner}; |
Also need to account for cond var wait, etc.
|
Pursue #11020 |
RocksDB has two public APIs: DB::LockWAL()/DB::UnlockWAL(), which acquire/release DBImpl::log_write_mutex_, and DB::GetSortedWalFiles(), which internally calls DBImpl::FindObsoleteFiles(), which in turn acquires/releases DBImpl::log_write_mutex_.
According to the comment on DBImpl::log_write_mutex_ (https://github.com/facebook/rocksdb/blob/7.8.fb/db/db_impl/db_impl.h#L2287:L2288), a thread that needs both mutex_ and log_write_mutex_ must acquire mutex_ first.
This puts limitations on how applications can use the LockWAL() API: after LockWAL() returns OK, the application should not perform any operation that acquires mutex_. Currently, the use case of LockWAL() is MyRocks implementing the MySQL storage engine handlerton lock_hton_log interface. The operation that MyRocks performs after LockWAL() is GetSortedWalFiles(), which acquires not only mutex_ but also log_write_mutex_.
There are two issues:
1. The application hangs if it calls GetSortedWalFiles() after calling LockWAL(), because log_write_mutex_ is not recursive.
2. Acquiring mutex_ while already holding log_write_mutex_ inverts the lock order required by the comment above.
Fix for 1
To fix 1, we can figure out whether the thread calling FindObsoleteFiles() is already holding log_write_mutex_. Therefore, we need to track the thread currently holding a mutex. The tracking logic is put in InstrumentedMutex.
Note that a std::thread::id can be reused in the same process once the prior thread finishes. We make sure a thread holding the mutex resets owner to a default-constructed std::thread::id before calling Unlock(). Also note that CondVar::Wait() releases the mutex without resetting owner. But keep in mind that whenever a thread is not sleeping and is holding the mutex, owner is set to the result of calling std::this_thread::get_id().
Fix for 2
To fix 2, we can update the LockWAL() API interface and implementation so that
GetSortedWalFiles() to LockWAL()
Note that the application should make sure UnlockWAL() is called if and only if LockWAL() returns success.
We also need to add a new API, DB::GetSortedWalFilesWithFileDeletionDisabled(), which assumes that file deletions have been disabled and does not need to acquire mutex_.
Test plan:
make check