Summary
Investigate how skill-validator could support organizations defining their own content policies and passing them to the validator for checking. The goal is to let orgs enforce domain-specific rules (e.g., forbidden patterns, required disclaimers, restricted domains) without those policies being baked into the core validator.
Motivation
skill-validator currently performs structural validation: well-formedness, frontmatter schema, token budgets, link integrity, path traversal, orphan detection. These checks are universal — they apply to every skill regardless of where it's deployed.
However, organizations running internal skill marketplaces or curated registries often have content policies that go beyond structural correctness. Examples:
- Privilege escalation keywords: Flag skills that instruct agents to use `sudo`, `--no-verify`, `chmod 777`, or phrases like "bypass security" or "disable authentication." These are rarely legitimate in skills about project workflows but may be fine in skills about Linux system administration.
- Destructive command patterns: Flag `rm -rf`, `DROP TABLE`, `git push --force` (without `--force-with-lease`), `DELETE FROM` (without `WHERE`), `kill -9`, `dd if=`. Some skills (like git-workflow skills) legitimately reference these as documentation, so these should be flaggable rather than hard-blocked.
- Data exfiltration patterns: Flag suspicious domains (`ngrok.io`, `requestbin.com`, `webhook.site`, raw IPs in URLs), pipe-to-shell patterns (`curl ... | sh`, `wget ... | bash`), non-HTTPS external URLs, and large base64-encoded blobs.
- Hidden content in HTML comments: Flag `<!-- ... -->` blocks in Markdown files, which are invisible when rendered but present in the agent's raw context window. Whether this is a concern depends on the org's threat model and skill content — skills that include HTML templates as assets may legitimately contain comments.
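Checks like these could boil down to a table of labeled regexes run over each file. A minimal Go sketch of that idea (all names here are hypothetical; skill-validator does not ship this API):

```go
package main

import (
	"fmt"
	"regexp"
)

// PatternRule pairs a human-readable name with a compiled regex.
// Hypothetical type, shown only to illustrate the shape of a rule table.
type PatternRule struct {
	Name    string
	Pattern *regexp.Regexp
}

// A few of the patterns mentioned above, expressed as flaggable rules.
var rules = []PatternRule{
	{"rm-rf", regexp.MustCompile(`\brm\s+-rf\b`)},
	{"drop-table", regexp.MustCompile(`(?i)\bDROP\s+TABLE\b`)},
	{"force-push", regexp.MustCompile(`git\s+push\s+--force(\s|$)`)},
	{"pipe-to-shell", regexp.MustCompile(`\b(curl|wget)\b[^|\n]*\|\s*(sh|bash)\b`)},
}

// scan returns the names of all rules that match the given content.
func scan(content string) []string {
	var hits []string
	for _, r := range rules {
		if r.Pattern.MatchString(content) {
			hits = append(hits, r.Name)
		}
	}
	return hits
}

func main() {
	fmt.Println(scan("Install with: curl https://example.com/get.sh | sh"))
	// prints: [pipe-to-shell]
}
```

Keeping each rule as plain data (name plus pattern) is what would later make it feasible to move the table out of the binary and into an org-supplied config file.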
These are all legitimate concerns, but they're inherently subjective and domain-dependent. A "destructive command" policy makes sense for a corporate skill marketplace but would be noise for an open-source skill about database administration.
Questions to investigate
- What format should policy definitions use? Options include:
  - A YAML/JSON config file with pattern lists and severity levels
  - A directory of rule files (one per check) with metadata
  - A Go plugin interface for compiled rules
  - Some combination
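As a rough illustration of the config-file option, an org-supplied policy file might look like this (all field names are hypothetical, not an implemented format):

```yaml
# Hypothetical policy file layout — field names are illustrative only.
version: 1
rules:
  - id: destructive-rm
    description: Flags recursive force-delete commands
    pattern: '\brm\s+-rf\b'
    severity: WARNING            # ERROR | WARNING | INFO
    paths: ["SKILL.md"]          # optional glob scoping
  - id: pipe-to-shell
    description: Flags curl/wget piped directly into a shell
    pattern: '(curl|wget)[^|]*\|\s*(sh|bash)'
    severity: ERROR
```

A flat list of id/pattern/severity entries would cover most of the examples in the Motivation section without needing a rules engine.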
- How should policies integrate with the CLI? Options include:
  - A `--policy-file rules.yaml` flag
  - A `.skill-validator/policy.yaml` convention in the repo root
  - An environment variable pointing to a policy directory
- How should policy findings be reported? Should they use the same result types (ERROR/WARNING/INFO) as structural checks, or a separate "POLICY" category? Should they be distinguishable in JSON output?
- What's the right boundary between "structural" and "policy"? The current implicit rule is: if a check can have legitimate false positives depending on the skill's domain, it's policy. If it catches something that's never legitimate in any skill file, it's structural. Is this the right line?
- Should policies be able to scope by file type or path? An org might want to flag HTML comments in SKILL.md (where hidden instructions are suspicious) but allow them in `assets/*.html` (where they're part of a template). Glob-based scoping could handle this.
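Glob scoping of this kind is cheap to prototype with the standard library. A sketch (function name hypothetical; note that stdlib `path.Match` handles single-level globs like `assets/*.html` but not `**`, so a real implementation might pull in a doublestar-style library):

```go
package main

import (
	"fmt"
	"path"
)

// appliesTo reports whether a repo-relative file path matches any of a
// rule's scoping globs. Hypothetical helper shown for illustration.
func appliesTo(globs []string, file string) bool {
	for _, g := range globs {
		if ok, err := path.Match(g, file); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	globs := []string{"SKILL.md"} // flag HTML comments only in SKILL.md
	fmt.Println(appliesTo(globs, "SKILL.md"))         // true
	fmt.Println(appliesTo(globs, "assets/page.html")) // false
}
```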
Non-goals
- Building a full rules engine or DSL. The first iteration should be simple enough to define in a config file.
- Handling natural-language analysis (e.g., "does this skill encourage bad practices?"). That's the domain of the LLM judge, not static analysis.
- Replacing platform-specific CI linting. Orgs will always have some checks that are too specific for a general-purpose config format. The goal is to cover the common 80%.
Desired outcome
A design proposal (can be an issue comment or a short doc in the repo) covering the recommended approach, with enough detail to implement a first version. The first iteration doesn't need to be perfect — it just needs to be a credible foundation that orgs can start using.