fix: improve test suite reliability for CI #511

Merged
jordan-rash merged 3 commits into main from fix/test-suite-reliability
Mar 25, 2026

Conversation

@jordan-rash
Contributor

  • Add ReadyForConnections(5s) after every NATS server.Start() to prevent "nats: timeout" errors when connecting before server is ready
  • Fix broken shutdown guards that looped Shutdown() when NumClients==0 instead of always calling it; replace with simple defer s.Shutdown()
  • Add IsReady(timeout) polling helper to NexNode; replaces CPU-burning spin loops and sleep-poll patterns across all test callers
  • Add WaitFor polling helper to _test/helpers.go; replace fixed time.Sleep assertions in client and agent tests with condition polling
  • Remove unnecessary time.Sleep calls in cmd/nex/node_test.go where Run() is synchronous

Signed-off-by: Jordan Rash <jordan@synadia.com>
@jordan-rash requested a review from a team as a code owner on March 25, 2026, 18:06
- Fix data race in AgentRegistrations.startHealthMonitor: acquire rwLock.RLock before iterating Registrations map, which is concurrently written by Add()
- Wrap all client test Auction calls in WaitFor polling retry to handle transient RequestMany stall timeouts under heavy parallel CI load
- Add inline auction retry in inmem_test.go (cannot import _test helper due to circular dependency) and increase context timeout 3s→30s
- Increase IsReady timeout in StartNexus helper 10s→30s for parallel test scenarios on slow CI runners

Signed-off-by: Jordan Rash <jordan@synadia.com>
@jordan-rash merged commit bc8c44c into main on Mar 25, 2026
4 checks passed
@jordan-rash deleted the fix/test-suite-reliability branch on April 2, 2026, 15:58