Replace bookie support downgrade to replace bookie with itself. #4013
Open
horizonzy wants to merge 13 commits into apache:master
hangc0276
reviewed
Jul 3, 2023
...-server/src/main/java/org/apache/bookkeeper/client/RackawareEnsemblePlacementPolicyImpl.java
bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
zymap
reviewed
Jul 4, 2023
Member
I think this issue comes down to how we exclude bookies. For example, suppose we have 3 bookies and E, W, Q is 3, 3, 2. If we lose a replica and then exclude every bookie in the ensemble, there are no bookies left to write to. You want to be able to choose the failed bookie itself, so the simplest approach is not to remove it from the candidate bookies. The existing logic, including the bookie writable check, will then handle the rest.
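The suggestion above can be illustrated with a minimal sketch. It is not BookKeeper code: the class `ExcludeSetSketch`, its methods, and the use of plain strings as bookie IDs are all hypothetical, and the "writable check" is modelled simply as membership in a list of writable bookies.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration only; none of these names are BookKeeper APIs.
final class ExcludeSetSketch {

    // Build the exclusion set from the ensemble, but leave out the bookie
    // being replaced so that it stays eligible as its own replacement.
    static Set<String> buildExcludeSet(List<String> ensemble, String toReplace) {
        Set<String> exclude = new LinkedHashSet<>(ensemble);
        exclude.remove(toReplace);
        return exclude;
    }

    // Stand-in for the existing selection logic: pick the first writable
    // bookie that is not excluded. A bookie that is not writable is never
    // returned, because it does not appear in writableBookies.
    static Optional<String> pick(List<String> writableBookies, Set<String> exclude) {
        return writableBookies.stream()
                .filter(bookie -> !exclude.contains(bookie))
                .findFirst();
    }
}
```

With an ensemble of `b0, b1, b2` and all three writable, replacing `b0` yields `b0` again; if `b0` is not writable, nothing is selected. This sketch does not express any preference for a fresh bookie over the original one; how candidates are ordered would be up to the real placement policy.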
Descriptions of the changes in this PR:
Fixes #4012
Behavior of this PR:
When replacing a failed bookie, we pick a new bookie if enough bookies are available. If no other bookie is available, we check whether the failed bookie is still alive. If it is, we downgrade to replacing the failed bookie with itself.
Example:
There are 4 bookies [0, 1, 2, 3] in the cluster, and the ledger ensemble is [0, 1, 2].
Case 1: We want to replace 0. [0, 1, 2] are excluded, so bookie 3 is picked. The new ledger ensemble is [3, 1, 2].
Case 2: Bookie 3 has shut down, so only [0, 1, 2] remain in the cluster. We want to replace 0. [0, 1, 2] are excluded, so there are no other bookies to select. Bookie 0 is still in the cluster, so we pick 0. The ledger ensemble remains [0, 1, 2].
Case 3: Bookies 0 and 3 have shut down, so only [1, 2] remain in the cluster. We want to replace 0. [0, 1, 2] are excluded, so there are no other bookies to select. Bookie 0 is no longer in the cluster, so an exception is thrown.
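The three cases can be traced with a small, self-contained sketch. It is an illustration under assumptions, not the code in this PR: the class `ReplaceBookieSketch`, the method `replaceBookie`, the integer bookie IDs, and the choice of `IllegalStateException` are all hypothetical, and "pick the lowest-numbered candidate" stands in for whatever the real placement policy does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names are BookKeeper APIs.
final class ReplaceBookieSketch {

    // Returns a new ensemble in which `failed` has been replaced.
    // `cluster` is the set of bookies currently available in the cluster.
    static List<Integer> replaceBookie(Set<Integer> cluster, List<Integer> ensemble, int failed) {
        if (!ensemble.contains(failed)) {
            throw new IllegalArgumentException("bookie " + failed + " is not in the ensemble");
        }

        // Normal path: every bookie already in the ensemble is excluded.
        Set<Integer> candidates = new TreeSet<>(cluster);
        candidates.removeAll(ensemble);

        int replacement;
        if (!candidates.isEmpty()) {
            // Case 1: a bookie outside the ensemble is available.
            replacement = candidates.iterator().next();
        } else if (cluster.contains(failed)) {
            // Case 2: no other bookie, but the failed one is still in the
            // cluster, so downgrade to replacing it with itself.
            replacement = failed;
        } else {
            // Case 3: no other bookie and the failed one is gone.
            throw new IllegalStateException("no bookie available to replace bookie " + failed);
        }

        List<Integer> newEnsemble = new ArrayList<>(ensemble);
        newEnsemble.set(newEnsemble.indexOf(failed), replacement);
        return newEnsemble;
    }
}
```

Calling `replaceBookie(Set.of(0, 1, 2, 3), List.of(0, 1, 2), 0)` returns `[3, 1, 2]`, the same call with cluster `Set.of(0, 1, 2)` returns `[0, 1, 2]`, and with cluster `Set.of(1, 2)` it throws.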
This can bring benefits to the following cases: