Add retry logic for Hermit initialization #3775

gurevichdmitry · 2025-12-17T16:31:52Z

Summary of your changes

There is no built-in GitHub Actions mechanism to retry a step on failure. Since this is a relatively simple use case, it is preferable to handle retries with a custom script rather than introducing a third-party action that would require additional version management and maintenance.

This PR introduces a retry script and applies it to the Hermit initialization command to make the CI workflows more resilient to transient failures.

mergify · 2025-12-17T16:32:39Z

This pull request does not have a backport label. Could you fix it @gurevichdmitry? 🙏
To fix up this pull request, add the backport labels for the needed branches, such as:

backport-v./d./d./d is the label to automatically backport to the 8./d branch. /d is the digit
backport-active-all is the label that automatically backports to all active branches.
backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.


mergify · 2025-12-23T23:27:52Z

This pull request now has conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b dg-add-retry-to-hermit upstream/dg-add-retry-to-hermit
git merge upstream/main
git push upstream dg-add-retry-to-hermit

Copilot

Pull request overview

This PR introduces retry logic for Hermit initialization to make CI workflows more resilient to transient failures. Instead of using third-party retry actions, it implements a custom retry script with exponential backoff.

Key changes:

Adds a new retry wrapper script (.ci/scripts/retry.sh) with configurable retry attempts and exponential backoff
Integrates retry logic into the Hermit initialization step in the hermit composite action
Consolidates workflow files to use the centralized hermit action instead of inline initialization steps
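
For readers without the diff open, the wrapper described above could look roughly like the following. This is a minimal sketch, not the contents of `.ci/scripts/retry.sh`: the function name `retry`, the doubling backoff factor, and the choice to log to stderr are assumptions. Only the argument order (max retries, initial delay, command string) and the use of `eval` come from the snippets quoted in this conversation.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a retry wrapper with exponential backoff.
# Usage: retry <max_retries> <initial_delay_seconds> '<command string>'
retry() {
    local max_retries=$1
    local delay=$2
    local command=$3
    local attempt=1
    local exit_code=0

    while [ "$attempt" -le "$max_retries" ]; do
        if eval "$command"; then
            # Log to stderr (an assumption) so stdout carries only the command's output.
            echo "Command succeeded on attempt $attempt" >&2
            return 0
        else
            # Inside the else branch, $? still holds the status of the eval'd command.
            exit_code=$?
        fi

        if [ "$attempt" -lt "$max_retries" ]; then
            echo "Attempt $attempt failed with exit code $exit_code; retrying in ${delay}s" >&2
            sleep "$delay"
            delay=$((delay * 2))   # exponential backoff: double the wait each time
        fi
        attempt=$((attempt + 1))
    done

    echo "Command failed after $max_retries attempts" >&2
    return "$exit_code"
}
```

With five attempts and an initial delay of 10 seconds, this sketch would wait 10, 20, 40, and 80 seconds between attempts before giving up and returning the last exit code.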

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.


| File | Description |
| --- | --- |
| `.ci/scripts/retry.sh` | New retry wrapper script with exponential backoff for handling transient command failures |
| `.github/actions/hermit/action.yml` | Integrates retry script into Hermit initialization and updates poetry install configuration |
| `.github/workflows/ci-pull_request.yml` | Migrates to use centralized hermit action for initialization |
| `.github/workflows/binary-size-monitor.yml` | Migrates to use centralized hermit action for initialization |
| `.github/workflows/packaging.yml` | Migrates to use centralized hermit action for initialization |
| `.github/workflows/test-environment.yml` | Migrates to use centralized hermit action for initialization |
| `.github/workflows/test-opa-coverage.yml` | Migrates to use centralized hermit action for initialization |
| `.github/workflows/test-opa-policies.yml` | Migrates to use centralized hermit action for initialization |
| `.github/workflows/updatecli.yml` | Migrates to use centralized hermit action for initialization |
| `.github/workflows/upgrade-environment.yml` | Migrates to use centralized hermit action for initialization |


Copilot · 2025-12-24T09:29:22Z

.ci/scripts/retry.sh

+    if eval "$command"; then
+        echo "Command succeeded on attempt $attempt"
+        exit 0
+    fi
+
+    exit_code=$?


The exit code captured at line 41 will always be 0: when the condition of an `if` with no `else` fails, the `if` statement itself returns 0, so `$?` no longer holds the status of the command that `eval` ran. Capture the exit code immediately after the `eval` on line 36, or invert the `if` logic. As written, this bug prevents the command's real exit code from being reported when all retries are exhausted.

Suggested change

-    if eval "$command"; then
-        echo "Command succeeded on attempt $attempt"
-        exit 0
-    fi
-
-    exit_code=$?
+    eval "$command"
+    exit_code=$?
+    if [ $exit_code -eq 0 ]; then
+        echo "Command succeeded on attempt $attempt"
+        exit 0
+    fi
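
The behavior behind this comment can be reproduced in isolation. The two functions below are hypothetical helpers written for illustration, not code from the PR; they assume the script does not enable `set -e`, under which a bare failing `eval` would end the script before `$?` is read.

```shell
#!/usr/bin/env bash
# Pattern from the original snippet: $? is read after the `if ... fi` block.
status_after_if() {
    if eval "$1"; then
        return 0
    fi
    echo "$?"   # status of the `if` compound command, not of the eval'd command
}

# Pattern from the suggested change: $? is read right after eval.
status_after_eval() {
    eval "$1"
    local exit_code=$?
    echo "$exit_code"
}

status_after_if '(exit 3)'     # prints 0
status_after_eval '(exit 3)'   # prints 3
```

An equivalent fix that keeps the `if` is to read `$?` inside an `else` branch, where it still holds the status of the condition command.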

Copilot · 2025-12-24T09:29:22Z

.github/actions/hermit/action.yml

+      run: |
+        max_retries=5
+        retry_delay=10
+        ./.ci/scripts/retry.sh "$max_retries" "$retry_delay" './bin/hermit env --raw >> "$GITHUB_ENV"'


The command passed to retry.sh contains shell redirection ('>> "$GITHUB_ENV"') that will be evaluated in the retry script's context, not in the hermit command's context. This means the output redirection happens after the eval, potentially causing issues. The command should be restructured to ensure the redirection is properly handled within the retry mechanism, or the retry script should be modified to handle output redirection correctly.

Suggested change

-        ./.ci/scripts/retry.sh "$max_retries" "$retry_delay" './bin/hermit env --raw >> "$GITHUB_ENV"'
+        ./.ci/scripts/retry.sh "$max_retries" "$retry_delay" './bin/hermit env --raw' >> "$GITHUB_ENV"
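
One thing worth checking with the suggested form: a redirection placed outside the wrapper captures everything the wrapper writes to stdout, including its own log lines. The snippet quoted earlier shows `retry.sh` echoing a success message without redirecting it, so those lines might end up in `$GITHUB_ENV`. The sketch below illustrates the difference with a hypothetical stand-in, `run_and_log`, and temporary files in place of `$GITHUB_ENV`; it is not the PR's script.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a wrapper that logs to stdout after running a command.
run_and_log() {
    eval "$1" && echo "Command succeeded on attempt 1"
}

inner=$(mktemp)   # stands in for $GITHUB_ENV in the first form
outer=$(mktemp)   # stands in for $GITHUB_ENV in the second form

# Redirection inside the command string: only the command's output reaches the file.
run_and_log 'echo "FOO=bar" >> "$inner"' > /dev/null

# Redirection outside the wrapper: the wrapper's log line is appended as well.
run_and_log 'echo "FOO=bar"' >> "$outer"

cat "$inner"   # one line: FOO=bar
cat "$outer"   # two lines: FOO=bar, then the log line
```

If the outer redirection is preferred, sending the wrapper's log messages to stderr (`>&2`) would keep them out of the target file.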

uri-weisman · 2025-12-24T09:42:55Z

I understand these changes let us retry a single step, but other steps might also be flaky, and we might not want to add a retry block to every one of them.
Should we consider using rerun-flaky-workflows to automatically re-run the GitHub Actions workflow when it fails?

gurevichdmitry added 2 commits December 17, 2025 18:18

add retry script

69abbe2

replace bash with action

d95916f

gurevichdmitry requested a review from a team as a code owner December 17, 2025 16:31

mergify bot assigned gurevichdmitry Dec 17, 2025

gurevichdmitry added 13 commits December 17, 2025 18:47

update workflows to use hermit action

ba97188

Merge branch 'main' into dg-add-retry-to-hermit

ff466b6

update workflow steps

b6f9072

Merge branch 'dg-add-retry-to-hermit' of https://github.com/elastic/c…

5c64e1c

…loudbeat into dg-add-retry-to-hermit

add installing test package

7b4fe1d

remove redundant steps

c99d420

update action input params

6ed870c

update action input params

b97368b

remove step

5c3cdf4

Merge branch 'main' into dg-add-retry-to-hermit

63dc998

updata hermit action input params

897a532

Merge branch 'dg-add-retry-to-hermit' of https://github.com/elastic/c…

1fdac36

…loudbeat into dg-add-retry-to-hermit

Merge branch 'main' into dg-add-retry-to-hermit

f1bda00

Merge branch 'main' into dg-add-retry-to-hermit

510da55

uri-weisman requested a review from Copilot December 24, 2025 09:25

Copilot started reviewing on behalf of uri-weisman December 24, 2025 09:25

Copilot AI reviewed Dec 24, 2025

fix review comments

0db0ea0

uri-weisman approved these changes Dec 25, 2025

gurevichdmitry added 2 commits December 25, 2025 10:54

fix failed command

fdc8bcf

Merge branch 'main' into dg-add-retry-to-hermit

3650516

3 participants
