Fix runtime runner provisioning regressions after merged x64 patch #20

Merged
depoll merged 1 commit into main from fix/runtime-provisioning-regressions on Feb 22, 2026

depoll (Owner) commented Feb 22, 2026

Summary

  • fix docker x64 provisioning regression by relying on Docker platform pinning instead of incorrect local image-arch pre-checking
  • prevent reconciler from prematurely deleting runners that are still provisioning (pending/configuring)
  • tolerate archive cleanup races in native runner download flow (ENOENT on unlink)

Validation

  • npm run typecheck -w backend
  • npm run test -w backend -- tests/dockerRunner.test.ts tests/reconciler.test.ts

Note

Medium Risk
Changes affect runner lifecycle/provisioning and reconciliation behavior; mistakes could leave orphaned runners or pull the wrong image architecture, but the diff is small and covered by targeted tests.

Overview
Fixes Docker runner provisioning by removing the architecture-specific retagging/inspection flow in pullRunnerImage and instead relying solely on Docker’s --platform pull, returning the base RUNNER_IMAGE reference.
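The platform-pinned pull described above might look roughly like the following. This is a minimal sketch, not the PR's code: the `ImagePuller` interface, the `RunnerArchitecture` union, and the placeholder `RUNNER_IMAGE` value are assumptions for illustration. Only the `linux/${architecture}` platform string and the idea of returning the base image reference come from the PR.

```typescript
// Hypothetical stand-in for the Docker client; the real code obtains one via initDocker().
interface ImagePuller {
  pull(image: string, options: { platform: string }): Promise<void>;
}

// Assumed set of supported architectures.
type RunnerArchitecture = "amd64" | "arm64";

// Placeholder value; the actual RUNNER_IMAGE reference is not shown on this page.
const RUNNER_IMAGE = "example.invalid/runner:latest";

async function pullRunnerImage(
  puller: ImagePuller,
  architecture: RunnerArchitecture,
): Promise<string> {
  const platform = `linux/${architecture}`;
  // Let Docker resolve the manifest for the requested platform.
  // No local image inspection or architecture-specific retagging follows.
  await puller.pull(RUNNER_IMAGE, { platform });
  return RUNNER_IMAGE;
}
```

Injecting the client keeps the sketch testable without a Docker daemon; the real function presumably closes over its own client instead.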

Prevents the reconciler from deleting runners that are still provisioning by skipping the GitHub existence/orphan check for pending and configuring statuses (with new unit tests for shouldSkipGitHubExistenceCheck).
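A status predicate of this kind could be as small as the sketch below. The body of `shouldSkipGitHubExistenceCheck` is not shown on this page, so this is an assumed implementation; the `RunnerStatus` union lists only the statuses mentioned in this PR discussion and may not match the real type.

```typescript
// Assumed status union; only statuses named in this PR discussion are listed.
type RunnerStatus = "pending" | "configuring" | "online" | "busy" | "error";

// Statuses in which a runner has not yet registered with GitHub.
const PROVISIONING_STATUSES: ReadonlySet<RunnerStatus> = new Set<RunnerStatus>([
  "pending",
  "configuring",
]);

// Returns true when the reconciler should not treat "missing in GitHub" as orphaned.
function shouldSkipGitHubExistenceCheck(status: RunnerStatus): boolean {
  return PROVISIONING_STATUSES.has(status);
}
```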

Hardens native runner downloads by making archive cleanup tolerant of ENOENT on fs.unlink, avoiding failures from concurrent/racy cleanup.
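An ENOENT-tolerant cleanup can be sketched as follows. The helper name `removeArchiveIfPresent` and the injectable `unlinkFn` parameter are hypothetical and exist only to make the example self-contained; the PR's actual code in runnerManager.ts is not shown here.

```typescript
import { unlink } from "node:fs/promises";

type UnlinkFn = (path: string) => Promise<void>;

// Narrow an unknown error to one carrying a specific errno-style code.
function isErrnoCode(err: unknown, code: string): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === code
  );
}

// Hypothetical helper: delete the archive, treating "already gone" as success.
async function removeArchiveIfPresent(
  archivePath: string,
  unlinkFn: UnlinkFn = unlink,
): Promise<void> {
  try {
    await unlinkFn(archivePath);
  } catch (err) {
    // Another cleanup path may have removed the file first; that is fine.
    if (isErrnoCode(err, "ENOENT")) return;
    // Any other failure (for example EACCES) still propagates.
    throw err;
  }
}
```

Only ENOENT is swallowed, so genuine permission or I/O problems remain visible to the caller.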

Written by Cursor Bugbot for commit 4900183.

Avoid false amd64 mismatch failures by relying on Docker platform pinning at container create time, skip reconciler orphan cleanup for provisioning runners, and tolerate missing archive cleanup during concurrent runner lifecycle events.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 22, 2026 09:46
@depoll depoll merged commit eb20e37 into main Feb 22, 2026
7 checks passed
@depoll depoll deleted the fix/runtime-provisioning-regressions branch February 22, 2026 09:47

Copilot AI left a comment

Pull request overview

This pull request fixes three runtime runner provisioning regressions that were introduced after a previous x64 patch. The changes simplify Docker image handling, prevent premature runner deletion, and handle file cleanup race conditions.

Changes:

  • Simplified Docker image provisioning by trusting Docker's built-in platform pinning instead of manual architecture verification and tagging
  • Added logic to prevent reconciler from deleting runners in pending or configuring states before they register with GitHub
  • Added tolerance for ENOENT errors during archive cleanup to handle race conditions in the native runner download flow

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| backend/src/services/dockerRunner.ts | Removed complex architecture verification and tagging logic; now relies on Docker's platform parameter |
| backend/src/services/reconciler.ts | Added shouldSkipGitHubExistenceCheck function to skip GitHub existence validation for provisioning runners |
| backend/tests/reconciler.test.ts | Added unit tests for the new shouldSkipGitHubExistenceCheck function |
| backend/src/services/runnerManager.ts | Added error handling to tolerate ENOENT errors when unlinking archive files |


@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

// Newly provisioning runners are expected to not exist in GitHub yet.
if (shouldSkipGitHubExistenceCheck(runner.status)) {
  continue;
}

Stuck provisioning runners are never cleaned up

Medium Severity

shouldSkipGitHubExistenceCheck unconditionally skips runners in pending/configuring status with no time-based bound. If provisioning hangs or the creating process crashes without transitioning the status to error, these runners remain in the database indefinitely — the reconciler never cleans them up, since neither the orphan check nor the stale-heartbeat query (which only matches online/busy) covers them.
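One possible mitigation for the concern above is to bound the skip by age, so that a runner stuck in `pending` or `configuring` eventually becomes eligible for cleanup again. The sketch below is not part of this PR; the `RunnerRecord` shape, the `createdAt` field, and the 15-minute `PROVISIONING_TIMEOUT_MS` default are all assumptions.

```typescript
// Hypothetical runner shape; the field names are assumed, not taken from the repository.
interface RunnerRecord {
  status: string;
  createdAt: Date;
}

// Assumed upper bound on how long provisioning may legitimately take.
const PROVISIONING_TIMEOUT_MS = 15 * 60 * 1000;

// Skip the GitHub existence check only while a provisioning runner is still
// younger than the timeout; older ones fall through to normal orphan handling.
function isWithinProvisioningWindow(
  runner: RunnerRecord,
  now: Date = new Date(),
  timeoutMs: number = PROVISIONING_TIMEOUT_MS,
): boolean {
  const provisioning =
    runner.status === "pending" || runner.status === "configuring";
  if (!provisioning) return false;
  return now.getTime() - runner.createdAt.getTime() < timeoutMs;
}
```

Passing `now` explicitly keeps the check deterministic in unit tests.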

): Promise<string> {
  const d = initDocker();
  const platform = `linux/${architecture}`;

Exported functions now unused in production code

Low Severity

normalizeImageArchitecture and assertImageArchitecture are still exported but their only production callers were inside pullRunnerImage, which this commit removed. They are now dead code in production, used only by dockerRunner.test.ts.
