ci: add agentic PR review checks for packages/react#3764
eliseo-juan wants to merge 3 commits into main
Add 4 parallel AI-powered review agents that run on every PR affecting packages/react/, mirroring Factorial's agentic-checks pattern adapted to the f0 quality gate areas:
- Code Review: F0 component conventions, TS strictness, exports, styling
- Accessibility Review: WCAG 2.1/2.2 AA, focusRing, ARIA, keyboard
- Storybook Review: story existence, Meta conventions, Snapshot, prop coverage
- Test Coverage Review: test file existence, zeroRender, behavioral coverage

All checks are advisory-only (never block CI) and post findings as PR comments. Uses the same trust model as Factorial (base-branch script/prompt copy to prevent prompt injection).
Pull request overview
Adds an “agentic checks” GitHub Actions workflow that runs parallel Copilot-powered review agents on PRs affecting packages/react/**, using a base-branch trust model to prevent prompt/script tampering from the PR head.
Changes:
- Introduces .github/workflows/agentic-checks.yml, a matrix workflow to run 4 review agents (code, a11y, Storybook, test coverage).
- Adds .github/scripts/agentic-check.sh to fetch the PR diff, run Copilot CLI, parse a structured verdict, and upsert a PR comment.
- Adds 4 agent prompt markdown files under .github/agent-prompts/.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| .github/workflows/agentic-checks.yml | New PR-triggered workflow with matrix agents + trust-model checkout/restore. |
| .github/scripts/agentic-check.sh | New runner script to execute Copilot agent on PR diff and post/update PR comments. |
| .github/agent-prompts/code-review.md | Prompt for code conventions and TS strictness checks. |
| .github/agent-prompts/a11y-review.md | Prompt for WCAG AA accessibility review of PR diff. |
| .github/agent-prompts/storybook-review.md | Prompt for Storybook story coverage and conventions review. |
| .github/agent-prompts/test-coverage-review.md | Prompt for unit test coverage expectations review. |
.github/scripts/agentic-check.sh
Outdated
| set +e | ||
| copilot \ | ||
| --model "${MODEL}" \ | ||
| --allow-all-tools \ | ||
| --add-dir . \ | ||
| --add-dir /tmp \ | ||
| --no-ask-user \ | ||
| --prompt "$(cat /tmp/prepared-prompt.md)" \ | ||
| 2>&1 | tee "${AGENT_OUTPUT_FILE}" | ||
| AGENT_EXIT_CODE=$? |
The Copilot process inherits the workflow environment, including GH_TOKEN (and potentially other secrets). Since the agent output is later posted as a PR comment, this creates a direct secret-exfiltration path. Consider unsetting non-required secrets for the copilot invocation (e.g., run Copilot with a minimal env, and keep GH_TOKEN scoped only to the gh calls) to reduce blast radius.
| set +e | |
| copilot \ | |
| --model "${MODEL}" \ | |
| --allow-all-tools \ | |
| --add-dir . \ | |
| --add-dir /tmp \ | |
| --no-ask-user \ | |
| --prompt "$(cat /tmp/prepared-prompt.md)" \ | |
| 2>&1 | tee "${AGENT_OUTPUT_FILE}" | |
| AGENT_EXIT_CODE=$? | |
| # Run Copilot with a minimal environment to avoid exposing workflow secrets | |
| set +e | |
| env -i \ | |
| PATH="${PATH}" \ | |
| HOME="${HOME:-/tmp}" \ | |
| COPILOT_GITHUB_TOKEN="${COPILOT_GITHUB_TOKEN}" \ | |
| copilot \ | |
| --model "${MODEL}" \ | |
| --allow-all-tools \ | |
| --add-dir . \ | |
| --add-dir /tmp \ | |
| --no-ask-user \ | |
| --prompt "$(cat /tmp/prepared-prompt.md)" \ | |
| 2>&1 | tee "${AGENT_OUTPUT_FILE}" | |
| AGENT_EXIT_CODE=${PIPESTATUS[0]} |
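The suggestion's move from `$?` to `PIPESTATUS[0]` matters because, after a pipeline, `$?` reports the status of the last command (here `tee`), not the agent. The following is a minimal bash sketch of that difference; `failing_agent` is a hypothetical stand-in for the Copilot CLI call and is not part of the PR.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the copilot CLI call: prints output, then fails.
failing_agent() {
  echo "agent output"
  return 3
}

# Runs the stand-in through tee and reports both candidate exit codes.
compare_exit_codes() {
  local -a statuses
  local first_status last_status
  failing_agent 2>&1 | tee /dev/null >/dev/null
  # PIPESTATUS must be copied immediately; the next command overwrites it.
  statuses=("${PIPESTATUS[@]}")
  first_status=${statuses[0]}   # exit code of failing_agent
  last_status=${statuses[1]}    # exit code of tee (what $? reports without pipefail)
  echo "dollar_question=${last_status} pipestatus0=${first_status}"
}

compare_exit_codes
```

With `pipefail` unset, the status of `tee` hides the agent failure, which is why the "Agent Error" branch would rarely trigger when the exit code is captured with `$?`.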
.github/scripts/agentic-check.sh
Outdated
| if [[ ${AGENT_EXIT_CODE} -ne 0 ]]; then | ||
| echo "::warning::Copilot agent exited with code ${AGENT_EXIT_CODE}" | ||
| | ||
| SUMMARY="### ${CHECK_EMOJI} ${CHECK_NAME}: Agent Error | ||
| | ||
| The review agent encountered an error (exit code ${AGENT_EXIT_CODE}). | ||
| This is an advisory check — it does not block the PR. | ||
| | ||
| <details> | ||
| <summary>Agent output</summary> | ||
| | ||
| \`\`\` | ||
| $(redact_secrets "${AGENT_OUTPUT}" | head -100) | ||
| \`\`\` | ||
| | ||
| </details>" | ||
| | ||
| echo "${SUMMARY}" >> "${GITHUB_STEP_SUMMARY:-/dev/null}" | ||
| # Advisory mode: always exit 0 | ||
| exit 0 | ||
| fi | ||
| | ||
| # ----------------------------------------------------------------------------- | ||
| # Parse verdict | ||
| # ----------------------------------------------------------------------------- | ||
| | ||
| VERDICT_LINE=$(grep -oP '<!-- VERDICT: \K.*?(?= -->)' "${AGENT_OUTPUT_FILE}" | tail -1 || true) | ||
| | ||
| if [[ -z "${VERDICT_LINE}" ]]; then | ||
| echo "::warning::No VERDICT marker found in agent output" | ||
| | ||
| SUMMARY="### ${CHECK_EMOJI} ${CHECK_NAME}: No Verdict | ||
| | ||
| The agent completed but did not produce a structured verdict. | ||
| This is an advisory check — it does not block the PR. | ||
| | ||
| <details> | ||
| <summary>Agent output (last 50 lines)</summary> | ||
| | ||
| \`\`\` | ||
| $(redact_secrets "${AGENT_OUTPUT}" | tail -50) | ||
| \`\`\` | ||
| | ||
| </details>" | ||
| | ||
| echo "${SUMMARY}" >> "${GITHUB_STEP_SUMMARY:-/dev/null}" | ||
| # Advisory mode: always exit 0 | ||
| exit 0 | ||
| fi |
When the agent errors or produces no VERDICT marker, the script writes to the step summary but does not upsert a PR comment. This makes the check easy to miss for reviewers (and diverges from the PR description’s “findings are posted as PR comments” behavior). Consider posting an upserted PR comment for these cases too (agent error / no verdict / skipped) so the PR always shows a visible result.
| - name: Restore trusted script and prompt | ||
| if: steps.copy-trusted.outputs.trusted == 'true' | ||
| run: | |
The restore step cps into .github/scripts/ and into $PROMPT_FILE without ensuring the destination directories exist. If a PR removes/renames these directories (while still touching packages/react/** so the workflow triggers), this step will fail before the check runs. Add a mkdir -p for the target directories (or use install -D) before copying to make the workflow robust.
| run: | | |
| run: | | |
| mkdir -p .github/scripts | |
| mkdir -p "$(dirname "$PROMPT_FILE")" |
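The comment also mentions `install -D` as an alternative to `mkdir -p` followed by `cp`. A minimal sketch of that variant follows, assuming GNU coreutils `install` (as found on Ubuntu runners); `restore_trusted_file` is a hypothetical helper name, not something defined in the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a trusted file back into the checkout, creating
# any missing parent directories of the destination (GNU install -D).
# Arguments: source path, destination path, optional file mode (default 0644).
restore_trusted_file() {
  local src="$1" dest="$2" mode="${3:-0644}"
  # -D creates all leading directories of dest before copying.
  install -D -m "${mode}" "${src}" "${dest}"
}

# Possible use inside the restore step (paths as in the workflow under review):
#   restore_trusted_file /tmp/agentic-check.sh .github/scripts/agentic-check.sh 0755
#   restore_trusted_file /tmp/prompt.md "$PROMPT_FILE"
```

This keeps the copy and the directory creation in one command, so a PR that deletes `.github/scripts/` can no longer make the restore step fail.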
.github/scripts/agentic-check.sh
Outdated
| if [[ "${PASS}" == "True" ]]; then | ||
| echo "PASSED" | ||
| | ||
| SUMMARY="### ${CHECK_EMOJI} ${CHECK_NAME}: Passed | ||
| | ||
| ${VERDICT_SUMMARY}" | ||
| | ||
| echo "${SUMMARY}" >> "${GITHUB_STEP_SUMMARY:-/dev/null}" | ||
| | ||
| # Post a success comment to the PR (so reviewers can see it ran) | ||
| post_pr_comment "${SUMMARY}" | ||
| else |
The "Passed" path posts VERDICT_SUMMARY back to the PR without running it through redact_secrets(). If the agent output ever includes credentials (accidentally or via prompt-injection), this would leak secrets into PR comments. Apply redaction (or stronger: blocklist/strip known env vars) consistently before posting comments in all paths, including the pass case.
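One way to apply redaction consistently is to route every comment body through a single helper before posting. The sketch below is hypothetical and is not the script's actual `redact_secrets`. It assumes the secrets of interest are `GH_TOKEN` and `COPILOT_GITHUB_TOKEN`, plus anything shaped like a GitHub token prefix.

```shell
#!/usr/bin/env bash
# Hypothetical redaction helper (not the script's actual redact_secrets).
# Masks the literal values of known secret variables and GitHub-style tokens.
redact_sketch() {
  local text="$1"
  local name value
  # Strip literal values of secret-bearing variables, when they are set.
  for name in GH_TOKEN COPILOT_GITHUB_TOKEN; do
    value="${!name:-}"
    if [[ -n "${value}" ]]; then
      text="${text//"${value}"/[REDACTED]}"
    fi
  done
  # Mask strings with GitHub token prefixes (ghp_, gho_, ghu_, ghs_, ghr_, github_pat_).
  printf '%s\n' "${text}" \
    | sed -E 's/(gh[pousr]_|github_pat_)[A-Za-z0-9_]{20,}/[REDACTED]/g'
}
```

The pass path could then call something like `post_pr_comment "$(redact_sketch "${SUMMARY}")"`, so no branch posts unredacted agent text. Pattern-based masking is best-effort and does not replace keeping secrets out of the agent's environment.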
Remove duplicated rules from agent prompts. Each prompt now only defines the agent role, scope rules, which skills to load, and the verdict format. All review criteria live in the skills (f0-code-review, a11y, f0-storybook-stories, f0-storybook-testing, f0-unit-testing, f0-quality-gate) as the single source of truth. Reduces total prompt lines from 411 to 189 (-54%).
- Switch from gh CLI to curl+jq for API calls
- Switch from grep -oP (Perl regex) to POSIX sed for VERDICT parsing
- Switch from Python to jq for JSON parsing
- Use -p flag for copilot CLI invocation
- Use PIPESTATUS[0] for exit code capture
- Add jq empty VERDICT JSON validation
- Never post raw agent output in PR comments (security)
- On pass: only write to job summary, skip PR comment
- On fail: post redacted summary + link to Actions logs
- Add ::group:: log grouping for better Actions UI
- Move COPILOT_GITHUB_TOKEN to step-level env (limit secret scope)
- Add Repository last commit diagnostic step
- Keep advisory mode (always exit 0) as intentional difference
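For reference, the switch from `grep -oP` to POSIX sed for VERDICT parsing could look roughly like the sketch below. It is not the committed implementation. The helper name `extract_verdict` and the JSON field names in the usage note are assumptions, and the pattern assumes at most one marker per line.

```shell
#!/usr/bin/env bash
# Extract the payload of the last "<!-- VERDICT: ... -->" marker in a file
# using only POSIX sed (no grep -P). Assumes at most one marker per line.
extract_verdict() {
  local file="$1"
  # -n plus the p flag prints only lines where the substitution matched;
  # tail keeps the final marker if the agent emitted more than one.
  sed -n 's/.*<!-- VERDICT: \(.*\) -->.*/\1/p' "${file}" | tail -n 1
}
```

Given a line such as `<!-- VERDICT: {"pass": true, "summary": "ok"} -->` (field names hypothetical), the function prints only the JSON payload. The payload could then be piped to `jq empty`, which exits non-zero on malformed JSON; an empty result needs a separate check, since `jq empty` accepts empty input.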
| if [ -f .github/scripts/agentic-check.sh ] && [ -f "$PROMPT_FILE" ]; then | ||
| cp .github/scripts/agentic-check.sh /tmp/agentic-check.sh | ||
| cp "$PROMPT_FILE" /tmp/prompt.md | ||
| echo "trusted=true" >> "$GITHUB_OUTPUT" | ||
| echo "Using trusted script and prompt from base branch" |
The trust-model copy step skips the check whenever the script/prompt don’t exist on the base SHA. Since this PR adds those files for the first time, the workflow will be a no-op on this PR (and any other PR until it’s merged), which conflicts with the stated goal of running these checks on PRs. Consider documenting this bootstrap behavior explicitly, or adding a one-time safe bootstrap path so the initial PR can be validated.
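One low-risk way to document the bootstrap behavior is to make the skip explicit in the copy step itself. The sketch below wraps the logic from the diff in a hypothetical function, `copy_trusted`, and adds an `else` branch. The function signature and the notice wording are illustrative, not taken from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical version of the copy step with an explicit bootstrap branch.
# Arguments: script path, prompt path, destination dir, output file
# (in the workflow the output file would be "$GITHUB_OUTPUT").
copy_trusted() {
  local script="$1" prompt="$2" dest="$3" output="$4"
  if [ -f "${script}" ] && [ -f "${prompt}" ]; then
    cp "${script}" "${dest}/agentic-check.sh"
    cp "${prompt}" "${dest}/prompt.md"
    echo "trusted=true" >> "${output}"
    echo "Using trusted script and prompt from base branch"
  else
    # Bootstrap case: the files do not exist on the base SHA yet.
    echo "trusted=false" >> "${output}"
    echo "::notice::Agentic check skipped: script or prompt missing on base branch (expected until this workflow is merged)."
  fi
}
```

The `::notice::` workflow command surfaces the skip in the Actions UI, so a no-op run on the introducing PR is visible rather than silent.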
| curl -sS -X POST \ | ||
| -H "Authorization: Bearer ${GH_TOKEN}" \ | ||
| -H "Accept: application/vnd.github+json" \ | ||
| -H "Content-Type: application/json" \ | ||
| "${GITHUB_API}/repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/comments" \ |
PR description mentions “Idempotent comments … upsert instead of posting duplicates”, but the script only ever creates a new comment via POST and doesn’t search/update an existing one. This will produce duplicate comments on every rerun/re-push. Implement an upsert flow (find existing comment by hidden marker keyed by the agent id and update it), or adjust the PR description/expectations.
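The upsert flow described in this comment can be sketched as three small pieces: a hidden marker per agent, a lookup of an existing comment carrying that marker, and a choice between PATCH and POST. Everything below is hypothetical. The marker format and function names are illustrative, and only the two REST endpoints (list issue comments, update an issue comment) are standard GitHub API paths.

```shell
#!/usr/bin/env bash
# Hypothetical upsert helpers; the marker format and function names are
# illustrative and not taken from the PR's script.

# Hidden HTML marker identifying the comment owned by a given agent.
comment_marker() {
  printf '<!-- agentic-check:%s -->' "$1"
}

# Print "METHOD URL" for the request: PATCH the existing comment when an id
# was found, otherwise POST a new one.
upsert_request() {
  local api="$1" repo="$2" pr="$3" existing_id="$4"
  if [[ -n "${existing_id}" ]]; then
    echo "PATCH ${api}/repos/${repo}/issues/comments/${existing_id}"
  else
    echo "POST ${api}/repos/${repo}/issues/${pr}/comments"
  fi
}

# Look up the id of the first comment containing the marker (needs curl, jq).
# Only the first page of results is inspected in this sketch.
find_comment_id() {
  local api="$1" repo="$2" pr="$3" marker="$4"
  curl -sS \
    -H "Authorization: Bearer ${GH_TOKEN}" \
    -H "Accept: application/vnd.github+json" \
    "${api}/repos/${repo}/issues/${pr}/comments?per_page=100" \
    | jq -r --arg m "${marker}" '[.[] | select(.body | contains($m))][0].id // empty'
}
```

The comment body would need to embed the output of `comment_marker` so later runs can find it. Keying the marker by agent id keeps the four matrix jobs from overwriting each other's comments.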
| payload=$(jq -n --arg body "${body}" '{"body": $body}') | ||
| curl -sS -X POST \ | ||
| -H "Authorization: Bearer ${GH_TOKEN}" \ | ||
| -H "Accept: application/vnd.github+json" \ | ||
| -H "Content-Type: application/json" \ | ||
| "${GITHUB_API}/repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/comments" \ | ||
| -d "${payload}" > /dev/null |
Despite “advisory mode” intent, post_pr_comment runs jq/curl under set -e without any error handling. If the API returns 403/429 or curl fails, the script will exit non-zero and the workflow will fail (becoming blocking). Make comment posting non-fatal (capture HTTP status / || true) and surface failures as warnings while still exiting 0.
| payload=$(jq -n --arg body "${body}" '{"body": $body}') | |
| curl -sS -X POST \ | |
| -H "Authorization: Bearer ${GH_TOKEN}" \ | |
| -H "Accept: application/vnd.github+json" \ | |
| -H "Content-Type: application/json" \ | |
| "${GITHUB_API}/repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/comments" \ | |
| -d "${payload}" > /dev/null | |
| # Build JSON payload; failures should not break the workflow. | |
| if ! payload=$(jq -n --arg body "${body}" '{"body": $body}'); then | |
| echo "::warning::Failed to build PR comment payload with jq; skipping PR comment." | |
| return 0 | |
| fi | |
| # Post the comment; capture HTTP status and treat failures as advisory-only. | |
| local http_status | |
| if ! http_status=$(curl -sS -o /dev/null -w "%{http_code}" \ | |
| -X POST \ | |
| -H "Authorization: Bearer ${GH_TOKEN}" \ | |
| -H "Accept: application/vnd.github+json" \ | |
| -H "Content-Type: application/json" \ | |
| "${GITHUB_API}/repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/comments" \ | |
| -d "${payload}"); then | |
| echo "::warning::curl failed while posting PR comment; skipping PR comment." | |
| return 0 | |
| fi | |
| if [[ "${http_status}" -lt 200 || "${http_status}" -ge 300 ]]; then | |
| echo "::warning::Failed to post PR comment (HTTP status ${http_status}); continuing without blocking CI." | |
| fi |
| set +e | ||
| copilot \ | ||
| -p "${PROMPT}" \ | ||
| --model "${MODEL}" \ | ||
| --allow-all-tools \ | ||
| --add-dir . \ | ||
| --add-dir /tmp \ | ||
| --no-ask-user \ | ||
| 2>&1 | tee "${OUTPUT_FILE}" |
The Copilot run uses --allow-all-tools while the script also notes the agent inherits environment variables (including secrets). Even if you don’t post raw output to PR comments, a prompt-injected agent can still exfiltrate secrets over the network if shell/network tools are available. Tighten the tool permission model (allow only the minimum needed), and/or run the copilot process with a reduced environment (don’t expose GH_TOKEN, and use minimal-scope tokens).
| set +e | |
| copilot \ | |
| -p "${PROMPT}" \ | |
| --model "${MODEL}" \ | |
| --allow-all-tools \ | |
| --add-dir . \ | |
| --add-dir /tmp \ | |
| --no-ask-user \ | |
| 2>&1 | tee "${OUTPUT_FILE}" | |
| # To reduce exposure, run Copilot in a subshell with GH_TOKEN removed so tools | |
| # invoked by the agent cannot access the GitHub token directly. | |
| set +e | |
| ( | |
| unset GH_TOKEN | |
| copilot \ | |
| -p "${PROMPT}" \ | |
| --model "${MODEL}" \ | |
| --allow-all-tools \ | |
| --add-dir . \ | |
| --add-dir /tmp \ | |
| --no-ask-user | |
| ) 2>&1 | tee "${OUTPUT_FILE}" |
| name: ${{ matrix.emoji }} ${{ matrix.name }} | ||
| permissions: | ||
| contents: read | ||
| pull-requests: write |
The workflow only grants pull-requests: write, but the script posts PR comments via the Issues Comments REST endpoint (/repos/.../issues/{PR_NUMBER}/comments). This typically requires issues: write; otherwise the comment POST will 403 and the check won’t report anything back to the PR. Add issues: write in the job permissions (or switch to a PR-review endpoint that matches the granted permissions).
| pull-requests: write | |
| pull-requests: write | |
| issues: write |
Description
Add 4 parallel AI-powered review agents that run on every PR affecting packages/react/, mirroring Factorial's agentic-checks pattern adapted to the f0 quality gate areas defined in packages/react/.skills/f0-quality-gate/SKILL.md.
Implementation details
4 Review Agents
- .github/agent-prompts/code-review.md
- .github/agent-prompts/a11y-review.md
- .github/agent-prompts/storybook-review.md
- .github/agent-prompts/test-coverage-review.md
Mapping from f0-quality-gate skill
| Skill | Agent |
|---|---|
| f0-code-review | code-review |
| a11y | a11y-review |
| f0-storybook-stories + f0-storybook-testing | storybook-review |
| f0-unit-testing | test-coverage-review |
Architecture
- … /tmp/, then checks out PR head and restores trusted files. Prevents prompt injection via malicious PRs.
- … packages/react/** changes.
- … @github/copilot@1.0.7 CLI with claude-opus-4.6 model.
Files Added
Required Setup
A repo admin needs to create the AGENTIC_CHECKS secret with a Copilot-enabled GitHub token before the workflow can run.