
Issue 138128 failure #1

Draft
benchaplin wants to merge 1 commit into main from cursor/issue-138128-failure-93ae

Conversation

@benchaplin
Owner

This PR addresses the flakiness of SearchWithRandomDisconnectsIT.testSearchWithRandomDisconnects (closes elastic#138128), which was observed to time out or hang on CI.

The test had two main issues:

  1. Timeout: The disruption loop, which runs 50-100 iterations, each followed by ensureFullyConnectedCluster(), had no time bound. On slower CI environments, this could cause the test to exceed the 20-minute suite timeout.
  2. Hang: During cluster teardown (often triggered by the suite timeout), runMoreSearches() could call prepareRandomSearch(), which would then throw a RuntimeException("Cluster is already closed"). Because the exception was not caught, finishFuture was never completed, so future.get() blocked indefinitely.
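The hang in item 2 can be reproduced in isolation. The following is a minimal sketch using only the JDK, not the test's actual code: `HangSketch`, `prepareSearch`, `runMoreSearchesUnguarded`, and `wouldHang` are hypothetical names, and a `CompletableFuture` stands in for finishFuture. It uses a bounded `get` so the demonstration itself terminates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class HangSketch {
    // Hypothetical stand-in for prepareRandomSearch(): throws once the cluster is closed,
    // otherwise returns an action whose execution completes the future.
    static Runnable prepareSearch(boolean clusterClosed, CompletableFuture<Void> finishFuture) {
        if (clusterClosed) {
            throw new RuntimeException("Cluster is already closed");
        }
        return () -> finishFuture.complete(null);
    }

    // Unguarded call: if prepareSearch throws, the only code path that
    // completes finishFuture is skipped.
    static void runMoreSearchesUnguarded(boolean clusterClosed, CompletableFuture<Void> finishFuture) {
        prepareSearch(clusterClosed, finishFuture).run();
    }

    // Returns true if finishFuture is still incomplete after a short wait,
    // which is the situation where an unbounded get() would block forever.
    static boolean wouldHang(boolean clusterClosed) {
        CompletableFuture<Void> finishFuture = new CompletableFuture<>();
        try {
            runMoreSearchesUnguarded(clusterClosed, finishFuture);
        } catch (RuntimeException e) {
            // Swallowed here to model the exception going unhandled by the waiter.
        }
        try {
            finishFuture.get(200, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cluster open, would hang: " + wouldHang(false));
        System.out.println("cluster closed, would hang: " + wouldHang(true));
    }
}
```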

The fix implements the following changes:

  • Added a try-catch block around prepareRandomSearch().execute(f) within runMoreSearches() to ensure that finishFuture is always completed, even if prepareRandomSearch() throws an exception (e.g., due to a closed cluster). This prevents the test from hanging.
  • Introduced a 5-minute time limit for the disruption loop to ensure it completes within a reasonable timeframe and does not contribute to exceeding the overall suite timeout.
  • Unmuted the test in muted-tests.yml.
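The first two changes can be sketched with plain JDK types. This is an illustration under stated assumptions rather than the diff itself: `FixSketch`, `runMoreSearchesGuarded`, and `runDisruptionLoop` are hypothetical names, the search is modeled as a `Consumer` that receives the future, and the sketch completes the future normally on failure (the description does not say whether the real fix completes it normally or exceptionally).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

final class FixSketch {
    // Guarded call: whatever happens while preparing or starting the search,
    // finishFuture ends up completed, so a waiting get() cannot block forever.
    static void runMoreSearchesGuarded(Supplier<Consumer<CompletableFuture<Void>>> prepareSearch,
                                       CompletableFuture<Void> finishFuture) {
        try {
            prepareSearch.get().accept(finishFuture);
        } catch (RuntimeException e) {
            // Assumption: completing normally is enough for the waiter to proceed.
            finishFuture.complete(null);
        }
    }

    // Time-bounded loop: stops at maxIterations or when the time limit elapses,
    // whichever comes first. Returns the number of iterations that actually ran.
    static int runDisruptionLoop(int maxIterations, long timeLimitNanos, Runnable disruptOnce) {
        long deadline = System.nanoTime() + timeLimitNanos;
        int completed = 0;
        // Compare via subtraction so the check stays correct if nanoTime wraps.
        while (completed < maxIterations && System.nanoTime() - deadline < 0) {
            disruptOnce.run();
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) {
        CompletableFuture<Void> finishFuture = new CompletableFuture<>();
        runMoreSearchesGuarded(() -> {
            throw new RuntimeException("Cluster is already closed");
        }, finishFuture);
        System.out.println("future completed: " + finishFuture.isDone());

        int ran = runDisruptionLoop(100, TimeUnit.MINUTES.toNanos(5), () -> { });
        System.out.println("iterations run: " + ran);
    }
}
```

Checking the deadline only between iterations means a single slow iteration can still overrun the limit, so the bound is approximate rather than strict.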


Add try-catch in runMoreSearches() to handle exceptions from
prepareRandomSearch() (e.g. cluster closed during teardown),
ensuring finishFuture is always completed and preventing hangs.

Add a 5-minute time limit on the disruption loop to prevent the
test from exceeding the 20-minute suite timeout on slow CI nodes.

Unmute the test.

Closes elastic#138128

Co-authored-by: Ben Chaplin <benchaplin@protonmail.com>
@cursor

cursor bot commented Feb 25, 2026

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.
Learn more about Cursor Agents


Development

Successfully merging this pull request may close these issues.

[CI] SearchWithRandomDisconnectsIT testSearchWithRandomDisconnects failing

2 participants