fix(ci): quote output in repeat_cmd_until to prevent flaky E2E deploys #2678

Open
manaswinidas wants to merge 1 commit into kubeflow:main from manaswinidas:fix/e2e-deploy-image-check-quoting

@manaswinidas
Contributor

Description

Fixes a flaky E2E test failure in the "Deploy Model Registry using manifests" step.

During a deployment rollout, pods running different images can temporarily co-exist. The jsonpath query then returns several values (e.g., two image names separated by a newline), and the repeat_cmd_until function in scripts/utils.sh receives all of them as its output. The unquoted $output in the [ ] test is word-split into multiple arguments, which triggers a "[: too many arguments" error. The test errors out instead of evaluating, so the condition never succeeds and the deploy step spins for the full 500-second timeout before failing.

Fix: Switch from [ $output $condition ] to [[ "$output" $condition ]], which properly handles multi-word values.

Example of the failure (from this CI run, attempt 1):

+ output='ghcr.io/kubeflow/model-registry/server:latest
ghcr.io/kubeflow/model-registry:main-'
+ '[' ghcr.io/kubeflow/model-registry/server:latest ghcr.io/kubeflow/model-registry:main- = ghcr.io/kubeflow/model-registry:main- ']'
././scripts/utils.sh: line 29: [: too many arguments
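
For anyone who wants to reproduce the trace above outside CI, the following is a minimal sketch of the same word-splitting behavior. The image names are illustrative stand-ins rather than the real registry tags, and the variable names are assumptions, not taken from scripts/utils.sh.

```shell
#!/usr/bin/env bash
# Two values separated by a newline, mirroring the shape of the CI trace.
# These image names are made up for illustration.
output='example.test/server:latest
example.test/server:main'
expected='example.test/server:main'

# Unquoted: word splitting turns $output into two arguments, so `[` receives
# four operands and reports a usage error (exit status 2, not a plain false).
unquoted_status=0
[ $output = "$expected" ] 2>/dev/null || unquoted_status=$?

# Quoted: "$output" stays one argument, so the comparison is well formed
# and simply evaluates to false (exit status 1).
quoted_status=0
[ "$output" = "$expected" ] || quoted_status=$?

echo "unquoted=$unquoted_status quoted=$quoted_status"
```

The distinction matters for a polling loop: status 2 means the test itself could not run, while status 1 means it ran and the condition is not yet met.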

How Has This Been Tested?

Verified the shell behavior locally:

# Before (fails with multi-word output):
output="val1 val2"; [ $output = val2 ]   # => bash: [: too many arguments

# After (works correctly):
output="val1 val2"; [[ "$output" = val2 ]]  # => no error, evaluates to false as expected
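
As a rough illustration of how a quoted comparison fits into a polling loop, here is a sketch of a retry helper. It is not the code in scripts/utils.sh: the name wait_for_output, its argument order, and the default interval are all hypothetical, and it takes the operator and operand as separate arguments rather than a single condition string.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only; not the repository's function.
# Usage: wait_for_output <command> <operator> <operand> <max_wait_secs> [interval_secs]
wait_for_output() {
    local cmd=$1 op=$2 operand=$3 max_wait_secs=$4 interval_secs=${5:-2}
    local start_time output
    start_time=$(date +%s)

    while true; do
        # Capture the command output as one string (embedded newlines kept),
        # then compare it as a single quoted argument.
        if output=$(eval "$cmd") && [ "$output" "$op" "$operand" ]; then
            return 0
        fi
        # Give up once the time budget is spent.
        if (( $(date +%s) - start_time >= max_wait_secs )); then
            echo "Timed out after ${max_wait_secs}s waiting for: $cmd" >&2
            return 1
        fi
        sleep "$interval_secs"
    done
}

# Example: succeeds immediately because the output matches.
wait_for_output "echo ready" = ready 5 0 && echo "condition met"
```

The sketch uses `[` rather than `[[ ]]` because the operator arrives in a variable; bash identifies `[[ ]]` operators when it parses the line, before expansion, so they need to appear literally in the source.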

Test Impact

No new tests needed; this fixes CI infrastructure flakiness, not application code.

Assisted-by: Claude (Anthropic)

Made with Cursor

@google-oss-prow google-oss-prow Bot requested review from chambridge and fege May 5, 2026 14:04
@google-oss-prow
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign andreyvelich for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…2E deploys

When multiple pods exist during a rollout, the jsonpath query returns
multiple image values. The unquoted $output in the [ ] test causes
"too many arguments" errors, making the condition never succeed and
the deploy step spin for the full 500s timeout before failing.

Switch to [[ ]] with quoted "$output" to handle multi-value output.

Assisted-by: Claude (Anthropic)
Signed-off-by: manaswinidas <dasmanaswini10@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@manaswinidas force-pushed the fix/e2e-deploy-image-check-quoting branch from 8e8b381 to 29c8911 on May 5, 2026 15:23