Skills are advertised as cross-agent compatible, but today they are only manually tested in Claude Code.
Proposed:
- A small tests/ directory with one canonical prompt per skill
- A runner that captures outputs from each supported agent
- A diff/scoring step (could lean on the AI-Eval-Skills toolkit)
This becomes the regression net when skills are edited.