refactor(agents): migrate dependabot AW review to workflow_run trigger #612

Open: katriendg wants to merge 3 commits into main from chore/aw-stale-ci

@katriendg
Collaborator

Description

The aw-dependabot-pr-review agentic workflow used to fire on pull_request_target, which meant the resolver step captured a snapshot of PR Validation while it was still pending or in_progress:*, and the advisory review was posted before the orchestrator ever finished. PR #608 was the canonical example: the review correctly applied the Isaac Sim numpy 2.x ABI guard, but its CI banner quoted a stale in_progress:in_progress conclusion.

This PR migrates the workflow to workflow_run keyed on PR Validation completed, reads the orchestrator's terminal conclusion straight from context.payload.workflow_run.conclusion, and pre-resolves failing per-surface check-runs once in the resolver step. The persona rubric is rewritten to consume those env vars and to map every terminal conclusion explicitly - pending and in_progress:* branches are gone because they are now unreachable.
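
As a rough illustration of reading the orchestrator's conclusion from the event payload, the sketch below validates the value against the seven terminal conclusions named in this description. The helper name `readValidationConclusion` and the `"unknown"` fallback are assumptions made for the example, not code taken from the resolver step.

```javascript
// The seven conclusions this description lists as possible under types: [completed].
const TERMINAL_CONCLUSIONS = new Set([
  "success", "failure", "cancelled", "timed_out",
  "neutral", "skipped", "action_required",
]);

// Hypothetical helper (not from the PR): read the conclusion from a
// workflow_run event payload and fall back to "unknown" for anything else.
function readValidationConclusion(payload) {
  const conclusion = payload?.workflow_run?.conclusion;
  return TERMINAL_CONCLUSIONS.has(conclusion) ? conclusion : "unknown";
}

// A completed run carries its conclusion directly on the payload.
console.log(readValidationConclusion({ workflow_run: { conclusion: "failure" } }));
```

Because pending and in_progress values never pass the set check, a caller written this way has no stale branch to handle.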

Related to #579.

Type of Change

  • 🐛 Bug fix (non-breaking change fixing an issue)
  • ✨ New feature (non-breaking change adding functionality)
  • 💥 Breaking change (fix or feature causing existing functionality to change)
  • 📚 Documentation update
  • 🏗️ Infrastructure change (Terraform/IaC)
  • ♻️ Refactoring (no functional changes)

Component(s) Affected

  • infrastructure/terraform/prerequisites/ - Azure subscription setup
  • infrastructure/terraform/ - Terraform infrastructure
  • infrastructure/setup/ - OSMO control plane / Helm
  • workflows/ - Training and evaluation workflows
  • training/ - Training pipelines and scripts
  • docs/ - Documentation

Changes

Workflow trigger and resolver

Switching to workflow_run runs the agent step against the trusted, default-branch copy of the workflow, so the gh-aw compiler can auto-inject fork-PR exclusion and the repository.id guard.

  • Replaced pull_request_target with workflow_run on workflows: ["PR Validation"], types: [completed], branches: ["dependabot/**"]. The branches: filter on workflow_run matches the triggering run's head_branch (not the base), so dependabot/** is the only value that fires for Dependabot PRs — using main here was the #583 regression fixed in #584. The workflow-level if: filters on workflow_run.event == 'pull_request', workflow_run.actor.login == 'dependabot[bot]', and a whitelist of seven terminal conclusions.
  • Kept on.bots: ["dependabot[bot]"] and on.roles: [admin, maintainer, write] at the top level — gh-aw's pre_activation guard checks the triggering actor against on.bots / on.roles independently of the workflow if:, so dropping these would resurrect the #585 / #586 User permission 'none' activation block.
  • Added checks: read to permissions: for server-side check-run enumeration; existing contents, pull-requests, and actions scopes are unchanged.
  • Rewrote the resolve-pr step. It reads context.payload.workflow_run, prefers workflow_run.pull_requests[0], and falls back to search.issuesAndPullRequests keyed on head_sha for the fork case. Both paths re-hydrate via pulls.get so body and draft are reliable.
  • Dropped the previous listWorkflowRunsForRepo lookup. PR_VALIDATION_CONCLUSION now reads directly from run.conclusion, which under types: [completed] is always one of success, failure, cancelled, timed_out, neutral, skipped, or action_required.
  • Added two new env vars exported by the resolver:
    • PR_VALIDATION_FAILING_CHECKS — JSON array of {name, html_url, conclusion} from checks.listForRef(ref=pr.head.sha) filtered to completed non-success/non-neutral/non-skipped runs.
    • PR_BODY — PR body hydrated server-side so the agent does not depend on the integrity-filtered MCP read of the PR.
  • New skip reasons in PR_DEPENDABOT_SKIP_REASON: not-a-pr-run and pr-resolution-failed, alongside the existing not-dependabot / draft.
  • Retargeted safe-outputs:
    • submit-pull-request-review.target → ${{ env.PR_NUMBER }}
    • add-comment.target → ${{ env.PR_NUMBER }} (was triggering, which is undefined under workflow_run)
    • create-pull-request-review-comment.target → "*"
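
The PR resolution order and the failing-check filter described in the list above can be sketched as two small pure functions. This is a minimal sketch under stated assumptions: the names `pickPrNumber` and `selectFailingChecks` are hypothetical, and the inputs are plain objects shaped like the REST responses rather than live API calls.

```javascript
// Conclusions the description treats as non-failing for the check-run filter.
const NON_FAILING = new Set(["success", "neutral", "skipped"]);

// Hypothetical helper: prefer the PR attached to the workflow run, else the
// first pull request among search results keyed on the head SHA.
function pickPrNumber(run, searchItems = []) {
  const attached = run.pull_requests?.[0]?.number;
  if (attached != null) return attached;
  // Search results mix issues and PRs; only PRs carry a pull_request key.
  const hit = searchItems.find((item) => item.pull_request !== undefined);
  return hit ? hit.number : null; // null would correspond to pr-resolution-failed
}

// Hypothetical helper: keep completed runs whose conclusion is not
// success, neutral, or skipped, trimmed to {name, html_url, conclusion}.
function selectFailingChecks(checkRuns) {
  return checkRuns
    .filter((c) => c.status === "completed" && !NON_FAILING.has(c.conclusion))
    .map(({ name, html_url, conclusion }) => ({ name, html_url, conclusion }));
}
```

Serializing the result of `selectFailingChecks` with `JSON.stringify` would give a value in the shape described for PR_VALIDATION_FAILING_CHECKS.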

Persona verdict rubric

The agent now reasons over a final CI signal, so the rubric collapses to a clean terminal-conclusion map.

  • Rewrote the Validation Signal section in .github/agents/dependabot-pr-reviewer.agent.md. The persona is told the workflow runs after PR Validation reaches a terminal conclusion, and is explicitly forbidden from calling checks.listForRef or commits/{sha}/check-runs — it reads PR_VALIDATION_FAILING_CHECKS from the environment instead.
  • Reframed the Surface to Check Run Map as an informational lookup for mapping a failing check name back to its dependency surface. The persona no longer walks it via the API.
  • Rewrote the Verdict Adjustment block as an explicit terminal-conclusion map:
    • success + no static concern + no sticky high-risk trigger → APPROVE-eligible, citing the orchestrator conclusion plus an empty PR_VALIDATION_FAILING_CHECKS.
    • failure | cancelled | timed_out | action_required → COMMENT; body MUST quote every entry from PR_VALIDATION_FAILING_CHECKS (name plus html_url).
    • neutral | skipped | unknown or PR_DEPENDABOT_SKIP_REASON == 'pr-resolution-failed' → COMMENT with a > [!CAUTION] banner: Deterministic CI signal unavailable ({conclusion}); review is advisory only.
  • Preserved the sticky Isaac Sim ABI guard verbatim — a numpy 2.x bump still keeps the verdict at COMMENT and forces the ⚠️ Maintainer review recommended banner regardless of CI conclusion.
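
Although the rubric lives in persona prose rather than code, the terminal-conclusion map above can be expressed as a small decision function, which may help when reasoning about edge cases. The function name `verdictFor`, its input shape, and the treatment of a success run that still has a static concern or a non-empty failing list (kept at COMMENT) are assumptions for illustration.

```javascript
// Conclusions that map to COMMENT with the failing checks quoted.
const FAILING = new Set(["failure", "cancelled", "timed_out", "action_required"]);

// Hypothetical sketch of the rubric; "APPROVE" here means APPROVE-eligible.
function verdictFor({
  conclusion,
  skipReason = "",
  failingChecks = [],
  staticConcern = false,
  stickyHighRisk = false,
}) {
  const banners = [];
  let citations = [];
  let verdict = "COMMENT";

  if (skipReason === "pr-resolution-failed" ||
      (conclusion !== "success" && !FAILING.has(conclusion))) {
    // neutral | skipped | unknown: no deterministic CI signal.
    banners.push(
      `> [!CAUTION]\n> Deterministic CI signal unavailable (${conclusion ?? "unknown"}); review is advisory only.`
    );
  } else if (FAILING.has(conclusion)) {
    // Quote every failing check by name plus html_url.
    citations = failingChecks.map((c) => `${c.name}: ${c.html_url}`);
  } else if (failingChecks.length === 0 && !staticConcern && !stickyHighRisk) {
    verdict = "APPROVE";
  }

  // The sticky guard applies regardless of the CI conclusion.
  if (stickyHighRisk) banners.push("⚠️ Maintainer review recommended");
  return { verdict, banners, citations };
}
```

Under this sketch a numpy 2.x bump on a green run still yields COMMENT, matching the sticky guard described in the last bullet.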

Workflow documentation and lock files

  • Rewrote the Trigger Posture and step-by-step prose in aw-dependabot-pr-review.md to describe the workflow_run execution model, the gh-aw compiler's auto-injected fork-PR exclusion and repository.id guard, and the new env-var contract.
  • Bumped github/gh-aw-actions/setup v0.68.3 → v0.71.1 in .github/aw/actions-lock.json (SHA ba90f21… → 239aec4…), picked up by recompilation.
  • Regenerated .github/workflows/aw-dependabot-pr-review.lock.yml via the gh-aw compiler — diff reflects the trigger swap, the new env vars, and the setup-action SHA bump. No hand edits.

Testing Performed

  • Terraform plan reviewed (no unexpected changes)
  • Terraform apply tested in dev environment
  • Training scripts tested locally with Isaac Sim
  • OSMO workflow submitted successfully
  • Smoke tests passed (smoke_test_azure.py)

None of the templated test surfaces apply — this PR only touches .github/agents/ and .github/workflows/. Validation evidence: npm run lint:md and npm run lint:yaml pass on the changed files; the aw-dependabot-pr-review.lock.yml artifact is regenerated rather than hand-edited and matches the gh-aw compiler output for the new source. The behavioural change is observable on the next Dependabot PR — the advisory review will fire after PR Validation completes and quote the orchestrator's terminal conclusion plus any failing per-surface checks.

Documentation Impact

  • No documentation changes needed
  • Documentation updated in this PR
  • Documentation issue filed

Bug Fix Checklist

Not a bug fix — this is a refactor of an agentic-workflow trigger surface.

  • Linked to issue being fixed
  • Regression test included, OR
  • Justification for no regression test:

Checklist

Related Issues

Related to #579

Notes

The min-integrity: approved setting on tools.github is intentionally preserved. The agent's MCP PR-body read is therefore filtered, which is why the resolver hydrates PR_BODY from the REST API server-side — the persona consumes the env var rather than relying on the filtered MCP payload.

  • Lowering min-integrity to unapproved was rejected on prompt-injection grounds; the resolver-side hydration is the chosen mitigation.
  • workflow_run runs in default-branch context, which means changes to the AW workflow itself cannot be exercised by a Dependabot PR — this is the secure-by-design tradeoff documented in the GitHub Security Lab "preventing pwn requests" guide and aligns with the gh-aw workflow_run recommendation.
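
This description does not say how the resolver writes PR_BODY into the environment. One possibility, shown purely as an assumption, is the multi-line `NAME<<DELIMITER` form that GitHub Actions documents for environment files. The helper name `formatGithubEnvEntry` is hypothetical; the collision check matters because the PR body is untrusted text.

```javascript
// Hypothetical helper: format one environment-file entry using the
// multi-line "NAME<<DELIMITER" syntax documented for GitHub Actions.
function formatGithubEnvEntry(name, value, delimiter) {
  // An untrusted value containing the delimiter could end the entry early
  // and smuggle in extra variables, so refuse it outright.
  if (value.includes(delimiter)) {
    throw new Error(`delimiter collides with the value of ${name}`);
  }
  return `${name}<<${delimiter}\n${value}\n${delimiter}\n`;
}

// Fixed delimiter for illustration only; a real step would generate a random one.
const entry = formatGithubEnvEntry(
  "PR_BODY",
  "Bumps numpy.\nSee release notes.",
  "EOF_PR_BODY"
);
console.log(entry);
```

A step built on a toolkit helper that exports variables would not need this by hand; the sketch only makes the underlying format visible.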

Follow-up Tasks

  • Validate behaviour on a grouped Dependabot update that produces multiple PR Validation runs against the same head SHA — confirm that only the latest completed run drives the advisory review.
  • After the first live Dependabot PR runs through the new trigger, compare the posted review's CI banner against the orchestrator's final conclusion and the failing-check list to confirm the staleness regression observed in PR #608 (security(deps): bump the training-dependencies group across 1 directory with 76 updates) is gone.
  • Confirm that safe-outputs.submit-pull-request-review and add-comment post successfully under workflow_run — the target: ${{ env.PR_NUMBER }} overrides are the #588 / #589 mitigation; a Not in pull request context skip in safe_outputs would mean the env var did not resolve.
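
For the first follow-up, one way to check that only the latest completed run drives the review is to compare the triggering run against the other runs for the same head SHA. The sketch below is a hypothetical check, not part of this PR; it assumes run objects carrying the `id`, `head_sha`, `status`, and `updated_at` fields found on workflow run records.

```javascript
// Hypothetical helper: newest completed run for a head SHA, or null if none.
function latestCompletedRun(runs, headSha) {
  return runs
    .filter((r) => r.head_sha === headSha && r.status === "completed")
    .reduce(
      (latest, r) =>
        latest === null || Date.parse(r.updated_at) > Date.parse(latest.updated_at)
          ? r
          : latest,
      null
    );
}

// Hypothetical helper: true when the given run is the newest completed run
// for its own head SHA, so an older duplicate would be skipped.
function shouldDriveReview(run, runs) {
  const latest = latestCompletedRun(runs, run.head_sha);
  return latest !== null && latest.id === run.id;
}
```

With two completed runs on one SHA, `shouldDriveReview` returns true only for the one with the later `updated_at`.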

katriendg and others added 2 commits May 4, 2026 18:31
- Switch trigger from pull_request_target to workflow_run gated on PR Validation completion on main
- Filter on workflow_run.actor.login == 'dependabot[bot]' (replacing pull_request_target bots:/roles: allowlists)
- Hydrate PR_VALIDATION_CONCLUSION from workflow_run payload and PR_VALIDATION_FAILING_CHECKS via checks.listForRef
- Tighten persona verdict rubric so non-success conclusions map to COMMENT with caution banner
- Replace persona check-run API walk with resolver-supplied env vars
- Regenerate aw-dependabot-pr-review.lock.yml

🤖 - Generated by Copilot

Co-authored-by: Copilot <copilot@github.com>
…dabot branches

- change workflow_run branches from main to dependabot/**
- clarify workflow execution context for Dependabot PRs

🔧 - Generated by Copilot
@katriendg katriendg requested a review from a team as a code owner May 5, 2026 06:19
@github-actions
Contributor

github-actions Bot commented May 5, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA f811335.
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

OpenSSF Scorecard

| Package | Version | Score | Details |
|---|---|---|---|
| actions/actions/github-script | 373c709c69115d41ff229c7e5df9f8788daa9553 | 🟢 7.7 | Per-check scores listed below |
| actions/github/gh-aw-actions/setup | 239aec45b78c8799417efdd5bc6d8cc036629ec1 | Unknown | Unknown |

Details for actions/actions/github-script:

| Check | Score | Reason |
|---|---|---|
| Code-Review | 🟢 10 | all changesets reviewed |
| Binary-Artifacts | 🟢 10 | no binaries found in the repo |
| Maintained | 🟢 10 | 21 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 10 |
| Packaging | ⚠️ -1 | packaging workflow not detected |
| Dangerous-Workflow | 🟢 10 | no dangerous workflow patterns detected |
| CII-Best-Practices | ⚠️ 0 | no effort to earn an OpenSSF best practices badge detected |
| Token-Permissions | 🟢 9 | detected GitHub workflow tokens with excessive permissions |
| Pinned-Dependencies | ⚠️ 1 | dependency not pinned by hash detected -- score normalized to 1 |
| Fuzzing | ⚠️ 0 | project is not fuzzed |
| License | 🟢 10 | license file detected |
| Signed-Releases | ⚠️ -1 | no releases found |
| Security-Policy | 🟢 9 | security policy file detected |
| SAST | 🟢 10 | SAST tool is run on all commits |
| Branch-Protection | 🟢 5 | branch protection is not maximal on development and all release branches |

Scanned Files

  • .github/workflows/aw-dependabot-pr-review.lock.yml

@codecov-commenter

codecov-commenter commented May 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.27%. Comparing base (10ab980) to head (f811335).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #612      +/-   ##
==========================================
- Coverage   77.27%   77.27%   -0.01%     
==========================================
  Files         272      272              
  Lines       18140    18137       -3     
  Branches     2452     2467      +15     
==========================================
- Hits        14018    14015       -3     
  Misses       3698     3698              
  Partials      424      424              
| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| pester | 83.13% <ø> (ø) | Carriedforward from db20d20 |
| pytest-data-pipeline | 100.00% <ø> (ø) | Carriedforward from db20d20 |
| pytest-dataviewer | 93.60% <ø> (-0.01%) ⬇️ | Carriedforward from db20d20 |
| pytest-dm-tools | 100.00% <ø> (ø) | Carriedforward from db20d20 |
| pytest-evaluation | 99.51% <ø> (ø) | |
| pytest-fuzz | 4.89% <ø> (+<0.01%) ⬆️ | Carriedforward from db20d20 |
| pytest-inference | 100.00% <ø> (ø) | Carriedforward from db20d20 |
| pytest-training | 93.32% <ø> (ø) | Carriedforward from db20d20 |
| vitest | 53.02% <ø> (ø) | Carriedforward from db20d20 |

*This pull request uses carry forward flags.
see 3 files with indirect coverage changes
