-
Notifications
You must be signed in to change notification settings - Fork 52
CASSSIDECAR-344: Sidecar endpoint for moving a node to a new token #275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
frankgh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in general, I have some comments.
| * | ||
| * @param newToken the new token for the node to move to | ||
| */ | ||
| void move(String newToken); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to propagate any potential IOException from JMX and we should handle it
| void move(String newToken); | |
| void move(String newToken) throws IOException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the method signature. OperationalJob.execute() already handles the exception. I also added tests for node move failure.
| * @param newToken the string to validate as a BigInteger, can be null | ||
| * @return true if the string can be successfully parsed as a BigInteger, false if null or invalid | ||
| */ | ||
| public static boolean isValidBigInt(String newToken) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would defer validation to Cassandra, and handle the exception thrown by Cassandra. There are tokens that are not long-based or big integer based
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, OrderPreservingPartitioner is deprecated and token is numeric for all other partitioners. Please correct me if I am wrong.
server/src/main/java/org/apache/cassandra/sidecar/handlers/NodeMoveHandler.java
Show resolved
Hide resolved
| String body = context.body().asString(); | ||
| if (body == null || body.equalsIgnoreCase("null")) | ||
| { | ||
| logger.warn("Bad request. Received null payload."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a log entry will be logged by the parent class when we throw the exception, so I think we should remove this line:
| logger.warn("Bad request. Received null payload."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| @JsonCreator | ||
| public NodeMoveRequestPayload(@JsonProperty(value = "newToken", required = true) String newToken) | ||
| { | ||
| this.newToken = newToken; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should ensure the token is provided
| this.newToken = newToken; | |
| this.newToken = Objects.requireNonNull(newToken, "newToken is required"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| { | ||
| NodeMoveRequestPayload payload = Json.decodeValue(body, NodeMoveRequestPayload.class); | ||
| String newToken = payload.newToken(); | ||
| if (StringUtils.isNullOrEmpty(newToken) || !DataTypeUtils.isValidBigInt(newToken)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's defer validation of the token to Cassandra. Since some tokens might not necessarily be numeric, i.e. org.apache.cassandra.dht.OrderPreservingPartitioner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, OrderPreservingPartitioner is deprecated and token is numeric for all other partitioners. Please correct me if I am wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that's correct, but I think we shouldn't make assumptions from the Sidecar side. Maybe reducing the input to BigInteger is sufficient, but I prefer to defer validation to the partitioner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the validation of the token.
| } | ||
| catch (DecodeException e) | ||
| { | ||
| logger.warn("Bad request. Received invalid JSON payload."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be logged in the parent abstract class. No need to log it here
| logger.warn("Bad request. Received invalid JSON payload."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| { | ||
| private static final Logger LOGGER = LoggerFactory.getLogger(NodeMoveJob.class); | ||
| private static final String OPERATION = "move"; | ||
| private static final String OPERATION_MODE_MOVING = "MOVING"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to consider the org.apache.cassandra.service.StorageService.Mode#MOVE_FAILED mode as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We just need to check if a move operation is in progress or not. Check if the status is "MOVING" accomplishes that.
| responseCode = "200", | ||
| content = @Content(mediaType = "application/json", | ||
| schema = @Schema(implementation = OperationalJobResponse.class))) | ||
| @APIResponse(description = "Node move operation initiated successfully", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we document the conflict scenario as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| assertThat(ring).isNotNull(); | ||
|
|
||
| RingEntry ringEntry = ring.stream() | ||
| .filter(entry -> entry.fqdn().equals(node)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this won't work if we had vnodes enabled for this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't enable vnodes for this test and it succeeds.
| @Singleton | ||
| public class NodeMoveHandler extends AbstractHandler<String> implements AccessProtected | ||
| { | ||
| private static final int MAX_TOKEN_LENGTH = 128; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT, document how you came up with value 128 for the maximum token length
frankgh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 looks good to me
sarankk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 Looks good to me, some minor nits.
| * @param newToken the new token for the node to move to | ||
| * @return a reference to this Builder | ||
| */ | ||
| public Builder nodeMoveRequest(String newToken) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: @NotNull annotation for newToken
| { | ||
| UUID jobId = UUID.randomUUID(); | ||
| String newToken = "123456789"; | ||
| String nodeMoveString = "{\"jobId\":\"" + jobId + "\",\"jobStatus\":\"SUCCEEDED\",\"instance\":\"127.0.0.1\"}"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Why do we return instance in response here? OperationalJobResponse seems to return only these 4 fields jobId, jobStatus, operation, reason
| .expect(ResponsePredicate.SC_OK) | ||
| .putHeader("content-type", "application/json") | ||
| .sendBuffer(io.vertx.core.buffer.Buffer.buffer(requestBody), context.succeeding(response -> { | ||
| assertThat(response.statusCode()).isEqualTo(OK.code()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: verify "jobStatus":"FAILED", "reason":"Simulated failure"
|
|
||
| // Validate the operational job status using the OperationalJobHandler | ||
| String jobId = responseBody.getString("jobId"); | ||
| validateOperationalJobStatus(jobId, "move", OperationalJobStatus.SUCCEEDED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarification question, does storageOps.move call succeed even if the node is in moving status?
Sidecar endpoint for moving a node to a new token
These changes are part of the effort to introduce bespoke Sidecar APIs to support key operational functionality currently managed through nodetool commands. Changes in the commit: