@dt dt commented Aug 20, 2025

Backport 1/1 commits from #151950.

/cc @cockroachdb/release


Previously, per-node progress flushing could queue up a large number of small changes, particularly if flushing was slower than the incoming updates. This could cause BACKUP to hang for hours while it drained the queue, even though the queued information is only marginally useful when an otherwise completed job is just writing out its debug info.

Now all the updates that arrive over a 15s window are rolled up before being saved. If the channel becomes full while saving, additional messages may be dropped.

Release note: none.
Epic: none.
Release justification: fixes severe bug that could stall backups for hours.

@dt dt requested review from a team as code owners August 20, 2025 23:02
@dt dt requested review from jeffswenson and removed request for a team August 20, 2025 23:02

blathers-crl bot commented Aug 20, 2025

Thanks for opening a backport.

Before merging, please confirm that it falls into one of the following categories (select one):

  • Non-production code changes. Includes test-only changes, build system changes, etc.
  • Fixes for serious issues. Defined in the policy as correctness, stability, or security issues, data corruption/loss, significant performance regressions, breaking working and widely used functionality, or an inability to detect and debug production issues.
  • Other approved changes. These changes must be gated behind a disabled-by-default feature flag unless there is a strong justification not to.

Add a brief release justification to the PR description explaining your selection.

Also, confirm that the change does not break backward compatibility and complies with all aspects of the backport policy.

All backports must be reviewed by the TL and EM for the owning area.

@blathers-crl blathers-crl bot added the backport label Aug 20, 2025

blathers-crl bot commented Aug 20, 2025

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

blathers-crl bot commented Aug 20, 2025

✅ PR #152214 is compliant with backport policy

Confidence: high
Critical bug criteria met: [Significant performance regressions]
Backward compatible: true
Explanation: The changes in this pull request address a severe bug that could stall backups for hours by managing per-node progress updates more efficiently. This justifies the exemption from the standard backport policy requirements as this fix clearly meets the critical bug criteria of preventing significant performance regressions. According to the policy, a bug is considered critical if it involves significant performance regressions. The PR does specifically focus on preventing a backup operation from hanging, which aligns with performance improvement and stability enhancement.

The code changes modify the per-node progress flushing logic in the backup job execution, ensuring that updates are accumulated and written less frequently, thereby preventing queuing issues and potential stalls. The modification is local and specific, and the logic is gated by a 15-second time window rather than being triggered by each progress update, which is a clear and targeted fix to the reported issue. No new settings or feature flags are introduced, and there are no backward compatibility concerns.

@msbutler msbutler merged commit 1cac24f into cockroachdb:release-24.3 Aug 21, 2025
16 checks passed
@dt dt deleted the backport24.3-151950 branch August 21, 2025 17:37
Labels: backport, target-release-24.3.20