Fix core dumps triggered by rocksdb compacting when shutdown bk#4706
Fix core dumps triggered by rocksdb compacting when shutdown bk#4706AnonHxy wants to merge 5 commits intoapache:masterfrom
Conversation
560ace4 to
bc91be9
Compare
| while (!entryLocationIndex.compareAndSetCompacting(false, true)) { | ||
| Thread.sleep(100); | ||
| } |
There was a problem hiding this comment.
Shouldn't it be done in the close method in the EntryLocationIndex?
ecaea19 to
7018ea1
Compare
1556a35 to
abee9aa
Compare
| @Override | ||
| public void close() throws IOException { | ||
| log.info("Closing EntryLocationIndex"); | ||
| while (!compacting.compareAndSet(false, true)) { |
There was a problem hiding this comment.
This waiting could result in the system being stuck here indefinitely, or it could take an exceptionally long time to get stuck at this step.
Should we add a maximum waiting time, or do something else, such as modifying RocksDB operations?
There was a problem hiding this comment.
I dont think so. Closing a DB which in compaction status may triger core dumps or other unexpected error. If closing the DB cost too long time, I think we'd better find out why or kill -9 if we need.
The handling approach here is similar to the procedure below:
| public void compact() throws IOException { | ||
| try { | ||
| isCompacting = true; | ||
| if (!compacting.compareAndSet(false, true)) { |
There was a problem hiding this comment.
I have a question:
Is the isCompacting syntax the root cause of this problem?
Or is it simply a matter of rewriting the method more efficiently, thus ruling out issues with the isCompacting setting?
There was a problem hiding this comment.
Yes. It's the root cause. We should cancel the compacting if we have closed the DB, or we should delay closing the DB if we have already been in compaction status. So here we need an atomic variable to serve as a flag for the DB status. @StevenLuMT
Descriptions of the changes in this PR:
Fix #4705
Motivation
Fix #4705
Changes
1.Setting the
compactingflag ofentryLocationIndexas true when shutdownLedgerStorageto stopping the subsequentcompact.2. When
LedgerStorageshutdown we will wait unit thecompactend.