-
Notifications
You must be signed in to change notification settings - Fork 42
Add retry logic for Hermit initialization #3775
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
This pull request does not have a backport label. Could you fix it @gurevichdmitry? 🙏
|
|
This pull request is now in conflicts. Could you fix it? 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces retry logic for Hermit initialization to make CI workflows more resilient to transient failures. Instead of using third-party retry actions, it implements a custom retry script with exponential backoff.
Key changes:
- Adds a new retry wrapper script (
.ci/scripts/retry.sh) with configurable retry attempts and exponential backoff - Integrates retry logic into the Hermit initialization step in the hermit composite action
- Consolidates workflow files to use the centralized hermit action instead of inline initialization steps
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
.ci/scripts/retry.sh |
New retry wrapper script with exponential backoff for handling transient command failures |
.github/actions/hermit/action.yml |
Integrates retry script into Hermit initialization and updates poetry install configuration |
.github/workflows/ci-pull_request.yml |
Migrates to use centralized hermit action for initialization |
.github/workflows/binary-size-monitor.yml |
Migrates to use centralized hermit action for initialization |
.github/workflows/packaging.yml |
Migrates to use centralized hermit action for initialization |
.github/workflows/test-environment.yml |
Migrates to use centralized hermit action for initialization |
.github/workflows/test-opa-coverage.yml |
Migrates to use centralized hermit action for initialization |
.github/workflows/test-opa-policies.yml |
Migrates to use centralized hermit action for initialization |
.github/workflows/updatecli.yml |
Migrates to use centralized hermit action for initialization |
.github/workflows/upgrade-environment.yml |
Migrates to use centralized hermit action for initialization |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
.ci/scripts/retry.sh
Outdated
| if eval "$command"; then | ||
| echo "Command succeeded on attempt $attempt" | ||
| exit 0 | ||
| fi | ||
|
|
||
| exit_code=$? |
Copilot
AI
Dec 24, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exit code capture at line 41 will always be 0 because the preceding 'if' statement succeeded (even though the command inside it failed). The exit code should be captured immediately after eval on line 36, or the 'if' statement logic should be inverted. Currently, this bug prevents the actual command exit code from being reported when all retries are exhausted.
| if eval "$command"; then | |
| echo "Command succeeded on attempt $attempt" | |
| exit 0 | |
| fi | |
| exit_code=$? | |
| eval "$command" | |
| exit_code=$? | |
| if [ $exit_code -eq 0 ]; then | |
| echo "Command succeeded on attempt $attempt" | |
| exit 0 | |
| fi |
| run: | | ||
| max_retries=5 | ||
| retry_delay=10 | ||
| ./.ci/scripts/retry.sh "$max_retries" "$retry_delay" './bin/hermit env --raw >> "$GITHUB_ENV"' |
Copilot
AI
Dec 24, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The command passed to retry.sh contains shell redirection ('>> "$GITHUB_ENV"') that will be evaluated in the retry script's context, not in the hermit command's context. This means the output redirection happens after the eval, potentially causing issues. The command should be restructured to ensure the redirection is properly handled within the retry mechanism, or the retry script should be modified to handle output redirection correctly.
| ./.ci/scripts/retry.sh "$max_retries" "$retry_delay" './bin/hermit env --raw >> "$GITHUB_ENV"' | |
| ./.ci/scripts/retry.sh "$max_retries" "$retry_delay" './bin/hermit env --raw' >> "$GITHUB_ENV" |
|
I understand those changes allows us to retry on a single steps, but other steps might also be flaky and we might not want to apply the retry block for all of them. |
Summary of your changes
There is no built-in GitHub Actions mechanism to retry a step on failure. Since this is a relatively simple use case, it is preferable to handle retries with a custom script rather than introducing a third-party action that would require additional version management and maintenance.
This PR introduces a retry script and applies it to the Hermit initialization command to make the CI workflows more resilient to transient failures.