Skip to content

Add structural validation checks ported from dotnet/skills#19

Open
danmoseley wants to merge 3 commits intodotnet:mainfrom
danmoseley:port-validation-checks
Open

Add structural validation checks ported from dotnet/skills#19
danmoseley wants to merge 3 commits intodotnet:mainfrom
danmoseley:port-validation-checks

Conversation

@danmoseley
Copy link
Copy Markdown
Member

@danmoseley danmoseley commented Mar 27, 2026

Port 5 static (non-AI) validation checks from dotnet/skills to pr-validation.yml. All checks run as inline PowerShell with no tokens or secrets, so they work on fork PRs.

New checks

Check Severity What it catches
Reference scanning Warning URLs to unknown domains, http:// instead of https://, pipe-to-shell patterns (curl | bash)
External dependency checker Warning Bundled scripts, INVOKES patterns, non-built-in #tool: refs, MCP servers not on the allowlist
Duplicate skill names Error Two skills with the same name across plugins
Aggregate description length Warning Total description text across all skills in a plugin exceeding 8192 chars
Bundled asset size Error Any single file in a skill directory over 5 MB

Config files

Two new files under eng/ let PR authors self-serve additions:

  • eng/known-domains.txt — URL domain allowlist for reference scanning. Entries with / match path prefixes; bare domains match the domain and subdomains.
  • eng/allowed-external-deps.txt — Allowlist for external dependencies (scripts, MCP servers, INVOKES, etc). Format: type:skill-name:detail.

Validation results

  • main: 0 errors, 4 pre-existing warnings (all token-size)
  • PR Add arcade-onboarding skill #6 (arcade-onboarding): 1 pre-existing error (body too long), 2 new actionable warnings (bundled script, unknown domain)

Note

This PR description was generated with the help of Copilot.

Port 5 static validation checks to pr-validation.yml:
- Reference scanning: unknown domains, http-not-https, pipe-to-shell, script-no-SRI
- External dependency checker: scripts, INVOKES, #tool: refs, MCP servers
- Duplicate skill name detection across plugins
- Aggregate description length warning per plugin
- Bundled asset size check (>5MB = error)

Add config files for self-serve allowlisting:
- eng/known-domains.txt: URL domain allowlist for reference scanning
- eng/allowed-external-deps.txt: external dependency allowlist

All checks are fork-safe (no tokens/secrets needed). New checks produce
warnings (non-blocking) except duplicate names and oversized assets.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 27, 2026 17:29
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Ports additional static structural validation checks (from dotnet/skills) into this repo’s PR validation workflow, and introduces allowlist config files under eng/ so contributors can manage warnings without requiring secrets (fork-safe).

Changes:

  • Add eng/known-domains.txt and eng/allowed-external-deps.txt as line-based allowlists for URL/reference scanning and external dependency warnings.
  • Extend .github/workflows/pr-validation.yml with new validations: reference scanning, external dependency checks, duplicate skill name detection, aggregate description-length warning, and bundled asset size enforcement.
  • Update workflow path filters to run validation when eng/** changes.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

File Description
eng/known-domains.txt Adds a domain/path allowlist used by PR validation reference scanning.
eng/allowed-external-deps.txt Adds an allowlist for external-dependency-related warnings (scripts, INVOKES, MCP servers, etc.).
.github/workflows/pr-validation.yml Implements the new validation checks and loads the new eng/ allowlists; updates triggers to include eng/**.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

danmoseley and others added 2 commits March 27, 2026 11:34
- MCP server check now uses already-parsed $pluginJson instead of
  re-reading and swallowing errors in a try/catch
- Fix tool-ref format in allowed-external-deps.txt (NAME -> SKILL-NAME)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 27, 2026 17:44
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +491 to +497
# Non-built-in #tool: references in SKILL.md content
foreach ($toolMatch in [regex]::Matches($skillContent, '#tool:[\w/]+')) {
$toolName = $toolMatch.Value.Substring(6)
if (-not $BuiltInTools.Contains($toolName)) {
$depKey = "tool-ref:$($skillName):$($toolMatch.Value)"
if (-not $allowedDeps.Contains($depKey)) {
$warnings.Add("$sCtx — non-built-in tool reference '$($toolMatch.Value)' — verify this is intentional. (allow: $depKey)")
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The #tool: matcher regex (#tool:[\w/]+) will stop at - or . characters, so references like #tool:mcp-binlog-tool would be detected as #tool:mcp and produce an incorrect allowlist key/warning. Expand the character class to include hyphens (and any other valid tool-name chars used in this repo) so tool refs are parsed reliably.

Copilot uses AI. Check for mistakes.
Comment on lines +394 to +401
# ── Reference scanning (per skill) ──────────────────
# Scan SKILL.md and all reference/*.md files for URL issues
if ($knownDomains.Count -gt 0) {
$scanFiles = @($skillMdPath)
$refsDir = Join-Path $skillDir.FullName 'references'
if (Test-Path $refsDir) {
$scanFiles += Get-ChildItem -Path $refsDir -Filter '*.md' | ForEach-Object { $_.FullName }
}
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reference scanning block only scans SKILL.md + skills/<skill>/references/*.md, but the PR description/config comments mention scanning “skill/agent content”. If agents (*.agent.md) are intended to be covered, add scanning for files under the plugin’s configured agents directory as well (and keep the behavior consistent with the allowlist in eng/known-domains.txt).

Copilot uses AI. Check for mistakes.
@danmoseley
Copy link
Copy Markdown
Member Author

@copilot address feedback push to this PR

@ViktorHofer
Copy link
Copy Markdown
Member

Why not just invoke skill-validator?

@danmoseley
Copy link
Copy Markdown
Member Author

I didn't want any of the AI based validation -- setting up the PATs and refreshing regularly, it's slow, etc. Is there a way to get the other parts without copying the code?

@ViktorHofer
Copy link
Copy Markdown
Member

ViktorHofer commented Mar 28, 2026

Yes. Just invoke "skill-validator check" instead of "skill-validator eval". The binary is available directly from the "SkillValidator Nightly" release tag. I.e. https://github.com/dotnet/skills/releases/download/skill-validator-nightly/skill-validator-linux-x64.tar.gz

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants