Add AI visual regression tester via GitHub Agentic Workflows by Copilot · Pull Request #811 · OmniBlocks/scratch-gui

Copilot · 2026-02-26T15:02:38Z

Created .github/workflows/ai-visual-regression.md agentic workflow
Uses Playwright MCP tool (not CLI) for browser navigation and screenshots
Added "sleep *" and "seq *" to bash allowlist so the health-check polling loop can run
Added discussions: false to safe-outputs.add-comment to drop unnecessary discussions: write permission (least privilege)
Compiled to .github/workflows/ai-visual-regression.lock.yml (via gh aw compile)

Original prompt

This section details the original issue you should resolve

<issue_title>AI visual regression tester</issue_title>
<issue_description>clearly the stupid visual regression tester that is supposed to test new react components and what not is absolutely vibe coded garbage so lets vibe code a less garbagy workflow using AI. in the regular static workflow the logic for traversing the GUI to get the point where it can screenshot the added component/button/ etc. is completely broken. half, if not all of the images that are returned in the comment are always broken. but now github has introduced agentic workflows which allow us to use AI to do workflows. these have an mcp server and when used with vision models they can theoretically see screenshots and know more than the regular static workflow. it can also click through the buttons more intelligently and look at the diff to see what has been added. However, these cost premium requests for me, so:

They should only run on PRs made by members of the org

It should not run on draft PRs

It should see if the actual UI code even changed at all before going to the agent job

Should not run if another one is already running

</issue_description>
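
The four gating conditions listed in the issue can be sketched as a single pre-check that runs before the agent job. This is a minimal illustration under stated assumptions, not the code that `gh aw compile` generates: `shouldRunAgent`, the `UI_PATHS` patterns, and the `alreadyRunning` flag are hypothetical names, while `draft` and `author_association` are fields of GitHub's pull request payload.

```javascript
// Hypothetical path patterns that count as "UI code" for this sketch.
const UI_PATHS = [/^src\/.*\.(jsx?|tsx?|css)$/];

// Returns { run, reason } so a workflow step can log why it skipped.
function shouldRunAgent(pr, changedFiles, alreadyRunning) {
  // Only PRs from org members (or owners) may spend premium requests.
  if (!["MEMBER", "OWNER"].includes(pr.author_association)) {
    return { run: false, reason: "author is not an org member" };
  }
  // Draft PRs are skipped.
  if (pr.draft) {
    return { run: false, reason: "draft PR" };
  }
  // Skip when no changed file matches a UI path pattern.
  if (!changedFiles.some((f) => UI_PATHS.some((re) => re.test(f)))) {
    return { run: false, reason: "no UI files changed" };
  }
  // Skip when another run is already in progress.
  if (alreadyRunning) {
    return { run: false, reason: "another run in progress" };
  }
  return { run: true, reason: "all checks passed" };
}
```

Ordering the cheap checks first means the diff inspection and the concurrency check only matter for PRs that could run at all. In an actual workflow the concurrency condition is usually handled by the workflow's own concurrency settings rather than by a flag like this.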

<agent_instructions>make sure to read this discussion (and my comments/questions on it): https://github.com/orgs/community/discussions/186451

also make sure to read this https://github.com/github/gh-aw/blob/main/create.md
</agent_instructions>

Comments on the Issue (you are @copilot in this section)

Fixes AI visual regression tester #810

💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Co-authored-by: supervoidcoder <88671013+supervoidcoder@users.noreply.github.com>

supervoidcoder · 2026-02-26T15:22:03Z

@copilot are you stupid why do you think i sent you that discussion link ... sighh look at this thread:

[supervoidcoder](https://github.com/supervoidcoder)
[2 weeks ago](https://github.com/orgs/community/discussions/186451#discussioncomment-15807315)
I wish it could use a free model so it wouldn’t use up a whole premium request (I’m not sure if it does, but I see it uses Claude sonnet 4.5 which usually takes up 1x request, so correct me if I’m wrong) for repetitive small tasks. For example the CI Doctor, we run tests on every commit. Would the only solution be to just make tests run weekly?

Also, I wish copilot agent could run on windows runners too. This way stuff like the cli test tool could run on my projects like [win-witr](https://github.com/supervoidcoder/win-witr) and other cli tools which are very often Windows only since I know Windows best.

5 replies
@[pelikhan](https://github.com/pelikhan)
pelikhan
[4 days ago](https://github.com/orgs/community/discussions/186451#discussioncomment-15890139)
You can use any model available in your subscription by using the engine.model front matter field.

@[pelikhan](https://github.com/pelikhan)
pelikhan
[4 days ago](https://github.com/orgs/community/discussions/186451#discussioncomment-15890147)
You can run your tests on a windows runner in a job, store the artifacts, then analyze them in the agent job.

Custom jobs can be added under the jobs with the usual actions syntax.

@[supervoidcoder](https://github.com/supervoidcoder)
supervoidcoder
[2 days ago](https://github.com/orgs/community/discussions/186451#discussioncomment-15915159)
Wait, @pelikhan, I've got another cool idea. I heard that it has playwright on the MCP right? Well, if I use a vision model, like Gemini 3 (which should be allowed since I saw it in the VS Code model picker with my github copilot pro subscription) then it can be able to SEE my web app, right? That would be really cool. Now I can get it to say if my PR broke a button or mangled the UI. I've been using regular github actions using pixel match, and it keeps yelling at me half the time due to a 0.01% change which happens to be a stray pixel in either screenshot 😆

@[eaftan](https://github.com/eaftan)
eaftan
[14 hours ago](https://github.com/orgs/community/discussions/186451#discussioncomment-15928633)
Yeah, you can definitely do that, and we do similar things ourselves!

https://github.github.com/gh-aw/reference/tools/#playwright-tool-playwright
https://github.com/github/gh-aw/blob/main/.github/workflows/unbloat-docs.md

@[supervoidcoder](https://github.com/supervoidcoder)
supervoidcoder
[14 hours ago](https://github.com/orgs/community/discussions/186451#discussioncomment-15928815)
Yeah, you can definitely do that, and we do similar things ourselves!

https://github.github.com/gh-aw/reference/tools/#playwright-tool-playwright
https://github.com/github/gh-aw/blob/main/.github/workflows/unbloat-docs.md

Yay, thanks!

so look at the unbloat docs md. you will see that AGENTIC WORKFLOWS ALREADY COMES WITH PLAYWRIGHT TOOL AND IT SHOULDN'T USE CLI


Copilot · 2026-02-26T15:28:21Z

@copilot are you stupid why do you think i sent you that discussion link ... sighh look at this thread:

...

Fixed in d4e42cf. The workflow now uses the Playwright MCP tool directly — the agent calls Playwright to navigate URLs and take screenshots (saved automatically to /tmp/gh-aw/mcp-logs/playwright/). Bash is only used for npm build and npx http-server to serve the app. No npx playwright CLI anywhere.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

github.github.com
- Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node /home/REDACTED/work/_temp/ghcca-node/node/bin/node --enable-source-maps /home/REDACTED/work/_temp/copilot-developer-action-main/dist/index.js (dns block)
https://api.github.com/repos/actions/checkout/git/ref/tags/v4
- Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v4 --jq .object.sha (http block)
https://api.github.com/repos/actions/github-script/git/ref/tags/v8
- Triggering command: /usr/bin/gh gh api /repos/actions/github-script/git/ref/tags/v8 --jq .object.sha (http block)
https://api.github.com/repos/github/gh-aw/git/ref/tags/v0.50.4
- Triggering command: /usr/bin/gh gh api /repos/github/gh-aw/git/ref/tags/v0.50.4 --jq .object.sha (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

supervoidcoder

lgtm

github-actions · 2026-02-26T16:36:36Z

🤖 Auto-Component Test Results

🎉 No new UI components detected!

This PR doesn't add any new buttons, menus, or clickable components.

Triggered by Copilot • Commit d4e42cf

github-actions · 2026-02-26T16:36:43Z

📝 Mega PR Test Skipped

Hi @Copilot! The comprehensive Mega PR test was skipped because this PR only changes documentation or configuration files (.md, .txt, .yml, .yaml).

To force the Mega PR test to run, comment /test pr on this PR.

github-actions · 2026-02-26T16:37:42Z

🎬 Visual Regression Testing - Skipped

Hi @Copilot! VRT was skipped because this PR only changes documentation or configuration files (.md, .txt, .yml).

To force VRT to run, comment /vrt on this PR.

Copilot

Pull request overview

Adds a new GitHub Agentic Workflow to perform AI-assisted visual regression checks on UI-related pull requests, replacing the existing brittle static visual regression approach.

Changes:

Introduces a human-editable gh-aw workflow source that builds the app, drives UI navigation via Playwright MCP, and reports screenshot comparisons.
Adds the compiled .lock.yml workflow that implements gating (org roles, non-draft, concurrency) and runs the Copilot-based agent with safe outputs and cache-memory baseline persistence.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `.github/workflows/ai-visual-regression.md` | Defines the agentic workflow frontmatter (gating, tool allowlist, cache-memory) and the agent prompt (diff analysis → build → serve → screenshot → baseline compare → report). |
| `.github/workflows/ai-visual-regression.lock.yml` | Compiled GitHub Actions workflow generated by `gh aw compile`, including activation/pre-activation gating, agent execution, safe-outputs comment posting, and cache-memory persistence. |

Copilot · 2026-02-26T16:42:16Z

.github/workflows/ai-visual-regression.md

+    - "ls *"
+    - "echo *"
+    - "kill *"
+    - "curl *"


The prompt’s health-check loop uses seq and sleep, but those commands aren’t allowed in the tools.bash allowlist. As-is, the agent will likely be blocked when following Step 3. Either add "sleep *" (and "seq *", if keeping this exact loop) to tools.bash, or change the polling example to avoid those commands (e.g., brace expansion plus a Playwright/Node-based wait).

Suggested change

-    - "curl *"
+    - "curl *"
+    - "sleep *"
+    - "seq *"
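
The review comment also mentions a Node-based wait as an alternative to the `seq`/`sleep` loop. The sketch below shows what such a polling helper might look like. It is only an illustration: the PR itself kept the shell loop and extended the allowlist, and `waitForServer`, `httpProbe`, and the default attempt count and delay are assumptions, not part of the workflow.

```javascript
// Polls `probe` until it reports ready or the attempts run out.
// `probe` is injected so the loop can be exercised without a real server.
async function waitForServer(probe, { attempts = 30, delayMs = 1000 } = {}) {
  for (let i = 1; i <= attempts; i++) {
    try {
      if (await probe()) return i; // number of attempts it took
    } catch {
      // Connection refused and similar errors mean "not ready yet".
    }
    // Sleep between attempts, but not after the last one.
    if (i < attempts) await new Promise((r) => setTimeout(r, delayMs));
  }
  throw new Error(`server not ready after ${attempts} attempts`);
}

// Example probe using the global fetch available in Node 18+.
// The URL passed in is whatever address the app is served on.
const httpProbe = (url) => async () => (await fetch(url)).ok;
```

A caller would use it as `await waitForServer(httpProbe(url))`, where `url` is the local address the built app is served from.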

Copilot · 2026-02-26T16:42:16Z

.github/workflows/ai-visual-regression.md

+safe-outputs:
+  add-comment:
+    max: 1
+    hide-older-comments: true


The safe-outputs add-comment tool causes the compiled workflow to request discussions: write permission, which appears unnecessary for a PR-only reporter. Consider setting safe-outputs.add-comment.discussions: false (per the tool’s documented option) so the workflow runs with least privilege while still being able to comment on PRs/issues.

Suggested change

-    hide-older-comments: true
+    hide-older-comments: true
+    discussions: false

Copilot · 2026-02-26T16:42:17Z

.github/workflows/ai-visual-regression.lock.yml

+            const allowedExtensions = [".png",".json"];
+            const result = validateMemoryFiles('/tmp/gh-aw/cache-memory', 'cache', allowedExtensions);
+            if (!result.valid) {
+              core.setFailed(`File type validation failed: Found $${result.invalidFiles.length} file(s) with invalid extensions. Only .png, .json are allowed.`);


Minor bug in the failure message template: the string interpolation uses $${result.invalidFiles.length} which will render an extra $ (e.g., Found $3 file(s) ...). This should be ${result.invalidFiles.length} like the similar step later in the file, otherwise the error output is confusing when file type validation fails.

Suggested change

-              core.setFailed(`File type validation failed: Found $${result.invalidFiles.length} file(s) with invalid extensions. Only .png, .json are allowed.`);
+              core.setFailed(`File type validation failed: Found ${result.invalidFiles.length} file(s) with invalid extensions. Only .png, .json are allowed.`);
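
The behaviour the reviewer describes can be reproduced in isolation. The snippet below is a small illustration that evaluates both strings as plain JavaScript template literals, outside any workflow templating; the `result` object is made-up sample data.

```javascript
// Made-up sample data standing in for the validation result.
const result = { invalidFiles: ["a.txt", "b.exe", "c.md"] };

// In a template literal, a `$` that is not followed by `{` is literal text,
// so `$${...}` emits a dollar sign and then the interpolated value.
const buggy = `Found $${result.invalidFiles.length} file(s)`;
const fixed = `Found ${result.invalidFiles.length} file(s)`;

console.log(buggy); // Found $3 file(s)
console.log(fixed); // Found 3 file(s)
```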

supervoidcoder

you're not supposed to edit the compiled yml

supervoidcoder · 2026-02-26T19:04:53Z

@copilot apply changes based on the comments in this thread

(don't edit the compiled file directly) don't forget to recompile


Copilot · 2026-02-26T19:09:30Z

@copilot apply changes based on the comments in this thread

(don't edit the compiled file directly) don't forget t...

Done in 7e2cdcb:

Added "sleep *" and "seq *" to the bash allowlist so the health-check polling loop works
Added discussions: false to safe-outputs.add-comment to drop the unnecessary discussions: write permission
Recompiled the lockfile

Skipped the $${...} fix in the lockfile per the comment thread — that's generated code.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

https://api.github.com/repos/actions/checkout/git/ref/tags/v4
- Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v4 --jq .object.sha (http block)
https://api.github.com/repos/actions/github-script/git/ref/tags/v8
- Triggering command: /usr/bin/gh gh api /repos/actions/github-script/git/ref/tags/v8 --jq .object.sha (http block)
https://api.github.com/repos/github/gh-aw/git/ref/tags/v0.50.5
- Triggering command: /usr/bin/gh gh api /repos/github/gh-aw/git/ref/tags/v0.50.5 --jq .object.sha (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

Initial plan

fff6da9

Copilot AI assigned Copilot and supervoidcoder Feb 26, 2026

github-project-automation bot added this to OmniBlocks Feb 26, 2026

Copilot started work on behalf of supervoidcoder February 26, 2026 15:02

Add AI visual regression tester using GitHub Agentic Workflows

916d23b

Co-authored-by: supervoidcoder <88671013+supervoidcoder@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Add AI visual regression tester for React components~~ Add AI visual regression tester via GitHub Agentic Workflows Feb 26, 2026

Copilot finished work on behalf of supervoidcoder February 26, 2026 15:19

Copilot started work on behalf of supervoidcoder February 26, 2026 15:22

fix: use Playwright MCP tool instead of CLI for screenshots

d4e42cf

Co-authored-by: supervoidcoder <88671013+supervoidcoder@users.noreply.github.com>

Copilot finished work on behalf of supervoidcoder February 26, 2026 15:29

supervoidcoder approved these changes Feb 26, 2026


supervoidcoder marked this pull request as ready for review February 26, 2026 16:35

Copilot AI review requested due to automatic review settings February 26, 2026 16:35

Copilot started reviewing on behalf of supervoidcoder February 26, 2026 16:36

Copilot AI reviewed Feb 26, 2026


Copilot started work on behalf of supervoidcoder February 26, 2026 19:05

fix: add sleep/seq to bash allowlist and disable discussions: write permission

7e2cdcb

Co-authored-by: supervoidcoder <88671013+supervoidcoder@users.noreply.github.com>

Copilot finished work on behalf of supervoidcoder February 26, 2026 19:10

supervoidcoder merged commit 04424c4 into main Feb 26, 2026

github-project-automation bot moved this to Done in OmniBlocks Feb 26, 2026

