
ci: add timeout and fail-fast settings to CI workflow #178

Merged
askpt merged 2 commits into main from claude/add-timeout-and-fail-fast-ci
Mar 2, 2026

Conversation

@Claude
Contributor

@Claude Claude AI commented Mar 2, 2026

Adds cost protection and full OS visibility to the cross-platform test matrix with two low-risk optimizations.

Changes

  • timeout-minutes: 15 on build job — caps runaway jobs at 15 minutes instead of GitHub's 6-hour default, protecting against excessive runner costs (especially on 10× more expensive macOS runners)

  • fail-fast: false in matrix strategy — ensures all OS jobs (macOS, Ubuntu, Windows) complete even when one fails, eliminating re-runs needed to diagnose OS-specific vs universal failures

jobs:
  build:
    timeout-minutes: 15
    strategy:
      fail-fast: false
      matrix:
        os: [macos-latest, ubuntu-latest, windows-latest]

Both changes are additive safety measures with no impact on passing workflows.

Original prompt

This section details the original issue you should resolve.

<issue_title>[ci-coach] ci: add timeout and fail-fast settings to CI workflow</issue_title>
<issue_description>### Summary

Two targeted, low-risk CI optimizations that improve cost protection and debugging visibility for the cross-platform test matrix.

Optimizations

1. Job Timeout (timeout-minutes: 15)

Type: Resource Sizing
Impact: Prevents runaway macOS/Windows runners from consuming excess minutes
Risk: Low

Changes:

  • Added timeout-minutes: 15 to the build job

Rationale: The VS Code extension pipeline (compile, lint, and test) should complete in well under 15 minutes. Without an explicit timeout, the GitHub default of 6 hours (360 minutes) applies, so a stuck job on macOS (which costs ~10× more than Ubuntu) could burn significant runner minutes before expiring. A 15-minute cap acts as a cost safety net.

Detailed Analysis

The CI matrix runs on 3 OSes. macOS runner pricing ($0.16/min) is approximately 10× Ubuntu ($0.016/min). A stuck job at the default 6-hour timeout on macOS = ~$57.60 per occurrence. With a 15-minute cap, worst-case exposure is ~$2.40 per occurrence.
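
The exposure figures above can be reproduced with a short calculation. This is a minimal sketch that takes the per-minute macOS rate exactly as quoted in the analysis (it is not verified against current GitHub pricing); the function name `worst_case_cost` and the constants are hypothetical names introduced for illustration.

```python
from decimal import Decimal

# Per-minute macOS rate as quoted in the analysis (assumption: not re-verified here).
MACOS_RATE_PER_MIN = Decimal("0.16")

DEFAULT_TIMEOUT_MIN = 6 * 60  # GitHub's default job timeout: 6 hours = 360 minutes
PROPOSED_TIMEOUT_MIN = 15     # value set via timeout-minutes in this change


def worst_case_cost(rate_per_min: Decimal, timeout_min: int) -> Decimal:
    """Cost of one stuck job that runs until the timeout cancels it."""
    return rate_per_min * timeout_min


before = worst_case_cost(MACOS_RATE_PER_MIN, DEFAULT_TIMEOUT_MIN)
after = worst_case_cost(MACOS_RATE_PER_MIN, PROPOSED_TIMEOUT_MIN)

print(f"default 6-hour timeout: ${before:.2f}")
print(f"15-minute cap:          ${after:.2f}")
print(f"exposure reduced by:    {before / after:.0f}x")
```

Decimal arithmetic is used so the dollar amounts match the quoted figures exactly rather than picking up floating-point rounding.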

Typical VS Code extension CI runs are observed to complete in 3–8 minutes across all platforms.

2. Matrix fail-fast: false

Type: Visibility / Debugging
Impact: All OS jobs complete in a single run — full failure picture without re-running
Risk: Low

Changes:

  • Added fail-fast: false to the matrix strategy

Rationale: The default fail-fast: true cancels all sibling matrix jobs as soon as one fails. For a cross-platform extension this obscures whether a failure is OS-specific or universal, forcing additional re-runs to gather complete data. Setting fail-fast: false ensures all three OS results are always available in a single workflow run, which is especially valuable when triaging tree-sitter native binding failures that can be OS-specific.

Detailed Analysis

With fail-fast: true (current): if the macOS job fails early, Ubuntu and Windows jobs are cancelled. The developer must push another commit or manually re-run to see if the failure is macOS-specific.

With fail-fast: false (proposed): all three jobs complete, giving an immediate complete picture of cross-platform health.

Cost impact of fail-fast: false: negligible. Jobs that would previously have been cancelled now run to completion, but they typically consume only the remaining minutes of their normal execution time (not the full 15-minute timeout).
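
The difference between the two settings can be illustrated with a toy model. This sketch is not GitHub's scheduler: it assumes all matrix jobs start at the same time, and the `Job` class, `run_matrix` function, and the durations are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    os: str
    duration_min: int  # hypothetical minutes the job needs to finish
    passes: bool


def run_matrix(jobs, fail_fast):
    """Return {os: outcome}, assuming every job starts at minute 0."""
    failure_times = [j.duration_min for j in jobs if not j.passes]
    first_failure = min(failure_times, default=None)
    outcomes = {}
    for job in jobs:
        still_running = first_failure is not None and job.duration_min > first_failure
        if fail_fast and still_running:
            # With fail-fast enabled, in-progress siblings are cancelled on the first failure.
            outcomes[job.os] = "cancelled"
        else:
            outcomes[job.os] = "success" if job.passes else "failure"
    return outcomes


# Hypothetical run: macOS fails early, the other two would pass.
jobs = [
    Job("macos-latest", 2, passes=False),
    Job("ubuntu-latest", 5, passes=True),
    Job("windows-latest", 7, passes=True),
]

print(run_matrix(jobs, fail_fast=True))
print(run_matrix(jobs, fail_fast=False))
```

In this model only the `fail_fast=False` run shows that the failure is macOS-specific; the `fail_fast=True` run leaves Ubuntu and Windows as cancelled, which is the ambiguity described above.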

Expected Impact

  • Cost protection: Eliminates runaway job cost exposure (6h → 15min timeout cap)
  • Debug efficiency: Eliminates need for extra CI re-runs to see full OS failure matrix
  • Risk Level: Low — both changes are additive safety measures with no functional impact on passing workflows

Testing Recommendations

  • Review workflow YAML syntax
  • Monitor first few runs after merge to confirm jobs complete within timeout
  • Compare runtime data before/after to validate 15-minute bound is sufficient
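
One way to carry out the last recommendation is to compare observed job durations against the cap. This is a rough sketch under stated assumptions: the helper names, the 1.5× safety factor, and the sample durations are all hypothetical and are not measurements from this repository.

```python
TIMEOUT_MIN = 15  # the timeout-minutes value from this change


def timeout_headroom(durations_min, timeout_min=TIMEOUT_MIN):
    """Return (slowest run, minutes left before the timeout would fire)."""
    if not durations_min:
        raise ValueError("need at least one observed duration")
    slowest = max(durations_min)
    return slowest, timeout_min - slowest


def bound_is_sufficient(durations_min, timeout_min=TIMEOUT_MIN, safety_factor=1.5):
    """True if the slowest run, padded by safety_factor, still fits under the cap.

    The 1.5x padding is an assumed margin for runner variance, not a GitHub rule.
    """
    slowest, _ = timeout_headroom(durations_min, timeout_min)
    return slowest * safety_factor <= timeout_min


# Illustrative durations in minutes (hypothetical, within the 3-8 minute range quoted above).
observed = [3, 5, 8]

slowest, headroom = timeout_headroom(observed)
print(f"slowest run: {slowest} min, headroom: {headroom} min")
print("15-minute bound sufficient:", bound_is_sufficient(observed))
```

If the check starts returning False after the merge, that suggests the cap is tight enough to risk cancelling healthy runs and may need to be raised.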

Generated by CI Optimization Coach

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/ci-coach.md@442992eda2ccb11ee75a39c019ec6d38ae5a84a2
  • expires on Mar 4, 2026, 1:13 PM UTC

[!NOTE]
This was originally intended as a pull request, but the git push operation failed.

Workflow Run: View run details and download patch artifact

The patch file is available in the agent-artifacts artifact in the workflow run linked above.

To create a pull request with the changes:

# Download the artifact from the workflow run
gh run download 22577429898 -n agent-artifacts -D /tmp/agent-artifacts-22577429898

# Create a new branch
git checkout -b ci-coach/optimize-ci-workflow-70b3353349b93821

# Apply the patch (--3way handles cross-repo patches where files may already exist)
git am --3way /tmp/agent-artifacts-22577429898/aw-ci-coach-optimize-ci-workflow.patch

# Push the branch to origin
git push origin ci-coach/optimize-ci-workflow-70b3353349b93821

# Create the pull request
gh pr create --title '[ci-coach] ci: add timeout and fail-fast settings to CI workflow' --base main --head ci-coach/optimize-ci-workflow-70b3353...

</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

- Fixes askpt/code-metrics#177

Co-authored-by: askpt <2493377+askpt@users.noreply.github.com>
@Claude Claude AI changed the title from "[WIP] Add timeout and fail-fast settings to CI workflow" to "ci: add timeout and fail-fast settings to CI workflow" Mar 2, 2026
@askpt askpt linked an issue Mar 2, 2026 that may be closed by this pull request
4 tasks
@askpt askpt marked this pull request as ready for review March 2, 2026 14:05
@askpt askpt self-requested a review as a code owner March 2, 2026 14:05
Copilot AI review requested due to automatic review settings March 2, 2026 14:05
Contributor

Copilot AI left a comment

Pull request overview

Updates the GitHub Actions CI workflow to improve cost control and cross-platform failure visibility for the existing OS test matrix.

Changes:

  • Added a 15-minute timeout to the build job to cap runaway executions.
  • Set strategy.fail-fast: false so macOS/Ubuntu/Windows matrix jobs all complete even if one fails.

@codecov

codecov bot commented Mar 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.17%. Comparing base (48935f9) to head (e411ddd).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #178   +/-   ##
=======================================
  Coverage   59.17%   59.17%           
=======================================
  Files           6        6           
  Lines        1984     1984           
  Branches      143      143           
=======================================
  Hits         1174     1174           
  Misses        810      810           

@askpt askpt merged commit 6a1a90c into main Mar 2, 2026
27 checks passed
@askpt askpt deleted the claude/add-timeout-and-fail-fast-ci branch March 2, 2026 14:14

Development

Successfully merging this pull request may close these issues.

[ci-coach] ci: add timeout and fail-fast settings to CI workflow

3 participants