feat: add resource migration job and troubleshooting section for upgrade failures#289

Open
mrMigles wants to merge 2 commits into main from fix/statefulset_migrator
Conversation

Member

@mrMigles mrMigles commented Mar 9, 2026

Introduced a new resource migration job to handle OpenSearch StatefulSets during upgrades, along with a troubleshooting section in the documentation addressing potential upgrade failures due to pre-deploy migration hooks. This includes detailed descriptions, stack traces, solutions, and recommendations for successful upgrades.

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update

Description

TBD

Related Tickets & Documents

  • Related Issue #
  • Closes #

QA Instructions, Screenshots, Recordings

Please replace this line with instructions on how to test your changes, a note
on the devices and browsers this has been tested on, as well as any relevant
images for UI changes.

Breaking Change checklist

If your PR includes any deployment or processing changes, please utilize this checklist:

  • Does it change or rename any deployment parameters, or alter how they work?
  • Was an update from the previous version tested with the same set of deployment parameters?

Added/updated tests?

  • Yes
  • No, and this is why: please replace this line with details on why tests
    have not been included
  • I need help with writing tests

[optional] Are there any things to highlight or double check?

[optional] What gif best describes this PR or how it makes you feel?

@github-actions github-actions bot added the enhancement (New feature or request) label Mar 9, 2026
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7bd3c87855

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

if [ "$NODE_MASTER" -gt 0 ]; then
  echo "[resource-migrator] StatefulSet $STS_NAME contains 'node.master' env (OpenSearch 1.x)"
  echo "[resource-migrator] Deleting StatefulSet $STS_NAME with --cascade=orphan"
  if $KUBECTL -n "$NS" delete statefulset "$STS_NAME" --cascade=orphan --ignore-not-found=true; then

[P1] Avoid orphaning StatefulSets before remaining hooks run

This hook deletes each matching StatefulSet with --cascade=orphan during pre-upgrade. Another pre-upgrade hook (migration-job.yaml, weight 10) can still fail with BackoffLimitExceeded, as documented in docs/public/troubleshooting.md, and that failure aborts the upgrade before the chart reapplies the StatefulSets. On an upgrade path where node.master is present and the later migration hook fails, the cluster is left running orphaned pods with no StatefulSet controller until a subsequent upgrade succeeds, which is a high-risk operational state.

Useful? React with 👍 / 👎.
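
One way to check whether a namespace is in the state described above is to list pods that no longer have an owning controller. The following is a minimal sketch, not code from this PR: the function name `list_orphaned_pods` is hypothetical, and it assumes that pods left behind by a `--cascade=orphan` delete carry no `ownerReferences`. `KUBECTL` follows the variable name used in the reviewed hunk and defaults to `kubectl` here.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): print the names of pods in a
# namespace that have no ownerReferences, i.e. no controller managing them.
KUBECTL="${KUBECTL:-kubectl}"

list_orphaned_pods() {
    ns="$1"
    # Emit one "name<TAB>owner-kinds" line per pod; the second field is
    # empty when the pod has no owner.
    $KUBECTL -n "$ns" get pods \
        -o 'jsonpath={range .items[*]}{.metadata.name}{"\t"}{.metadata.ownerReferences[*].kind}{"\n"}{end}' |
        awk -F '\t' '$1 != "" && $2 == "" { print $1 }'
}
```

Calling `list_orphaned_pods "$NS"` after a failed upgrade would print nothing if every pod still has a controller. Any names it prints are candidates for the orphaned state the review describes, although standalone pods created without a controller would also appear.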

…arch upgrades

Introduced a new section detailing the resource migration job that automatically handles the removal of incompatible OpenSearch 1.x StatefulSets during upgrades to 2.x. This section explains the job's functionality, parameters, and how it integrates with ArgoCD to ensure a smooth upgrade process without manual intervention.
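
The detection and removal step described here might look roughly like the sketch below. It is an illustration under stated assumptions, not the job's actual script: the function names `count_node_master` and `migrate_statefulset` are hypothetical, the way `node.master` is counted (exact match on container env names) is a guess, and the "skipping" log line is invented. Only the `delete ... --cascade=orphan --ignore-not-found=true` call and the `[resource-migrator]` prefix come from the hunk quoted in the review.

```shell
#!/bin/sh
# Sketch only: assumes the 1.x marker is a container env var named node.master.
KUBECTL="${KUBECTL:-kubectl}"

# Print how many container env vars in the StatefulSet are named node.master.
count_node_master() {
    ns="$1"; sts="$2"
    $KUBECTL -n "$ns" get statefulset "$sts" \
        -o 'jsonpath={.spec.template.spec.containers[*].env[*].name}' 2>/dev/null |
        tr ' ' '\n' |
        grep -c -x -F 'node.master' || true   # grep exits 1 on zero matches
}

# Delete the StatefulSet object but keep its pods when the marker is present.
migrate_statefulset() {
    ns="$1"; sts="$2"
    count="$(count_node_master "$ns" "$sts")"
    if [ "$count" -gt 0 ]; then
        echo "[resource-migrator] Deleting StatefulSet $sts with --cascade=orphan"
        $KUBECTL -n "$ns" delete statefulset "$sts" --cascade=orphan --ignore-not-found=true
    else
        # Hypothetical log line, not taken from the PR.
        echo "[resource-migrator] StatefulSet $sts has no 'node.master' env, skipping"
    fi
}
```

Because `--cascade=orphan` leaves the pods running, the StatefulSet is expected to be recreated by the chart later in the same upgrade. The review comment on this PR concerns the case where that later step never runs.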
@github-actions github-actions bot added the documentation (Improvements or additions to documentation) label Mar 9, 2026
Labels

documentation: Improvements or additions to documentation
enhancement: New feature or request

1 participant