fix: improve node selection in sticky session #575
Description
Fixed a bug where chproxy selected nodes from only a small subset of all nodes, even for random session_id values.
Additionally, resolved performance bottlenecks caused by inefficient use of hash(sessionId) within loops.
Closes #574
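
As a rough illustration of the idea described above (hash the session ID once, outside the loop, and let the hash pick a starting point among all nodes), here is a minimal sketch. It is not the code from this PR: the `node` type, `hashSessionID`, `pickNode`, and the choice of FNV-1a are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// node is a hypothetical stand-in for a backend node; it is not chproxy's type.
type node struct {
	addr   string
	active bool
}

// hashSessionID hashes the session ID with FNV-1a.
// The hash function is an assumption; the PR does not name one.
func hashSessionID(sessionID string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(sessionID))
	return h.Sum32()
}

// pickNode computes the hash once, before the loop, and uses it to choose
// a starting index across all nodes. It then probes forward (wrapping
// around) until it finds an active node.
func pickNode(nodes []node, sessionID string) (node, bool) {
	if len(nodes) == 0 {
		return node{}, false
	}
	start := int(hashSessionID(sessionID) % uint32(len(nodes)))
	for i := 0; i < len(nodes); i++ {
		n := nodes[(start+i)%len(nodes)]
		if n.active {
			return n, true
		}
	}
	return node{}, false
}

func main() {
	nodes := []node{
		{"127.0.1.1", true},
		{"127.0.2.2", true},
		{"127.0.3.3", true},
	}
	n, ok := pickNode(nodes, "session-42")
	fmt.Println(n.addr, ok)
}
```

Because the starting index is derived from the hash modulo the full node count, every node is a possible target, and the hash is evaluated once per request rather than once per loop iteration.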
Pull request type
Please check the type of change your PR introduces:
Checklist
Does this introduce a breaking change?
Further comments
By design, sticky sessions should consistently route all requests with the same session_id to the exact same node, regardless of whether the node is active or not. However, the current implementation fails to maintain this consistency when the selected node's active status changes while the session is in use. Resolving this issue presents challenges, especially in chproxy topologies with two or more replicas, which may require introducing distributed storage solutions like Redis. I have added a TODO in the code and will open another issue about this.

For example, a request is routed to 127.0.1.1 based on its session_id. If 127.0.1.1 later becomes inactive, subsequent requests with the same session_id are incorrectly rerouted to another active node (e.g., 127.0.2.2) instead of remaining directed to 127.0.1.1.

Conversely, a request whose session_id maps to 127.0.1.1 initially selects 127.0.2.2 because 127.0.1.1 is inactive. When 127.0.1.1 later becomes active, subsequent requests with the same session_id are incorrectly switched to 127.0.1.1, breaking session stickiness.
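
The two scenarios can be reproduced with a simplified model of "prefer the hashed node, fall back to the next active one". This is only a sketch of the behaviour described here, not chproxy's code: the `backend` type and `route` function are hypothetical, and the preferred index stands in for the result of hashing the session_id.

```go
package main

import "fmt"

// backend is a hypothetical stand-in for a node; it is not chproxy's type.
type backend struct {
	addr   string
	active bool
}

// route returns the first active backend, starting at the preferred index
// (standing in for hash(session_id) modulo the node count) and wrapping around.
// It returns an empty string when no backend is active.
func route(backends []backend, preferred int) string {
	for i := 0; i < len(backends); i++ {
		b := backends[(preferred+i)%len(backends)]
		if b.active {
			return b.addr
		}
	}
	return ""
}

func main() {
	backends := []backend{
		{"127.0.1.1", true},
		{"127.0.2.2", true},
	}
	preferred := 0 // assume the session_id hashes to 127.0.1.1

	first := route(backends, preferred)

	backends[0].active = false // 127.0.1.1 goes down
	second := route(backends, preferred)

	backends[0].active = true // 127.0.1.1 comes back
	third := route(backends, preferred)

	// prints "127.0.1.1 127.0.2.2 127.0.1.1"
	fmt.Println(first, second, third)
}
```

The same session lands on two different nodes over its lifetime. Keeping it pinned would require remembering the original assignment somewhere, which is why shared storage comes up for multi-replica deployments.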