Improve the failover efficiency #362

greatsharp · 2025-10-12T03:23:50Z

This enhancement improves failover efficiency when many master node probes fail concurrently.

Since only one UpdateCluster call can succeed at a time, we should reset the failure count after UpdateCluster succeeds; the remaining nodes can then retry failover soon afterwards, with no need to wait more than 15 seconds (ping_interval_seconds * max_ping_count).

For example, if 3 master nodes fail over concurrently, it previously took more than 45 seconds; now it takes 18 seconds (12s + 3s + 3s).
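
The arithmetic behind those numbers can be sketched roughly as below. This is only an illustrative model, not the code in controller/cluster.go: the values `pingIntervalSeconds = 3` and `maxPingCount = 5` are assumptions (the description only gives their product, 15s, and the 12s + 3s + 3s breakdown), and the function names are hypothetical.

```go
package main

import "fmt"

// Assumed values: only their product (15s) is stated in the description.
const (
	pingIntervalSeconds = 3
	maxPingCount        = 5
)

// oldTotalSeconds models the previous behaviour as a lower bound:
// every failed master waits a full probe window of
// pingIntervalSeconds*maxPingCount before its failover can go through.
func oldTotalSeconds(masters int) int {
	if masters <= 0 {
		return 0
	}
	return masters * pingIntervalSeconds * maxPingCount
}

// newTotalSeconds models the behaviour after this change: with probes at
// t = 0, 3, 6, ... the maxPingCount-th failure is seen at
// (maxPingCount-1)*pingIntervalSeconds, and each remaining master
// retries on the next probe tick instead of waiting a whole new window.
func newTotalSeconds(masters int) int {
	if masters <= 0 {
		return 0
	}
	first := (maxPingCount - 1) * pingIntervalSeconds
	return first + (masters-1)*pingIntervalSeconds
}

func main() {
	fmt.Printf("before: at least %ds, after: %ds\n",
		oldTotalSeconds(3), newTotalSeconds(3))
}
```

Under these assumptions, three masters come out at 45s before and 12s + 3s + 3s = 18s after, which matches the figures above.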

BTW, we should also increase minIdleConns for each node. This saves the time spent acquiring new connections when probing and when syncing the cluster topology to all nodes; 10 is enough.

git-hulk · 2025-10-12T03:39:56Z

@greatsharp Thanks for your improvement.

codecov-commenter · 2025-10-12T03:47:30Z

Codecov Report

❌ Patch coverage is 50.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 47.11%. Comparing base (6c56470) to head (0bfb1d2).
⚠️ Report is 97 commits behind head on unstable.

Files with missing lines	Patch %	Lines
controller/cluster.go	50.00%	5 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable     #362      +/-   ##
============================================
+ Coverage     43.38%   47.11%   +3.72%     
============================================
  Files            37       45       +8     
  Lines          2971     4453    +1482     
============================================
+ Hits           1289     2098     +809     
- Misses         1544     2147     +603     
- Partials        138      208      +70

Flag	Coverage Δ
unittests	`47.11% <50.00%> (+3.72%)`	⬆️


Improve the failover efficiency

0bfb1d2

git-hulk approved these changes Oct 12, 2025


git-hulk merged commit 54a7ac4 into apache:unstable Oct 12, 2025
4 checks passed
