feat: WebMCP Security Scanner (CHK-WEB-001..008) #21
0xChitlin wants to merge 2 commits into MikeeBuilds:main
Add new scan_webmcp.sh scanner targeting Chrome 146's WebMCP API (navigator.modelContext) security risks. Introduces 8 new checks across a new WebMCP category, bringing the total to 71 checks across 9 categories.

New checks:
- CHK-WEB-001: Untrusted WebMCP origins (Critical)
- CHK-WEB-002: Excessive capability declarations (Warn)
- CHK-WEB-003: Unscoped modelContext grants (Warn)
- CHK-WEB-004: Cross-origin service injection (Critical)
- CHK-WEB-005: Data exfiltration via service access (Critical)
- CHK-WEB-006: Prompt injection in service descriptions (Critical)
- CHK-WEB-007: Missing service authentication (Warn)
- CHK-WEB-008: Form auto-submission data leakage (Warn)

Also includes:
- WebMCP threat model (references/webmcp-threat-model.md)
- Detailed check catalog (references/webmcp-checks.md)
- Updated main check-catalog.md with WebMCP section
- Version bump to 1.3.0
Summary of Changes (Gemini Code Assist)
This pull request introduces a critical new security scanner for the recently announced Chrome 146 WebMCP API. By adding a dedicated WebMCP security category with eight new checks and comprehensive documentation, the ClawPinch toolkit is positioned as a first-mover in auditing this emerging attack surface, significantly enhancing its capability to identify and mitigate novel AI agent-related vulnerabilities.
Code Review
This pull request introduces a comprehensive WebMCP security scanner, significantly enhancing the security auditing capabilities of ClawPinch. While the new scan_webmcp.sh script is well-structured and robust, and the accompanying documentation is clear, a medium-severity argument injection vulnerability was identified. The CONFIG_PATH variable in scan_webmcp.sh is not properly sanitized before being passed to jq, which could allow an attacker to alter scanner behavior or exfiltrate data. Addressing this vulnerability is crucial to fully realize the security benefits of this scanner.
    (.webmcp.endpoints // [])[] .url // empty,
    (.mcpServers // {}) | to_entries[]? | .value.url // empty,
    (.webmcp.trustedOrigins // [])[] // empty
    ' "$CONFIG_PATH" 2>/dev/null || true)
The script is vulnerable to argument injection. The $CONFIG_PATH variable, which can be controlled by the OPENCLAW_CONFIG_PATH environment variable (set on line 65), is passed directly to the jq command here. If an attacker sets this environment variable to a filename starting with a hyphen (e.g., -L /tmp/), jq will interpret the filename as a command-line option. This could allow an attacker to alter the scanner's execution, load untrusted jq modules, and potentially exfiltrate data from the configuration files being parsed.
Remediation:
To fix this, you should sanitize the CONFIG_PATH variable immediately after it is defined on line 65 to ensure it's always treated as a path. Add the following code snippet after line 65:
    if [[ "$CONFIG_PATH" == -* ]]; then
      CONFIG_PATH="./$CONFIG_PATH"
    fi
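The guard above can also be wrapped in a small reusable helper. The following is a minimal sketch, not code from the PR: the function name `sanitize_path_arg` and the fallback file name `openclaw.json` are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of scan_webmcp.sh): make a path safe to pass
# as a positional argument by ensuring it can never begin with "-".
sanitize_path_arg() {
  local path="$1"
  if [[ "$path" == -* ]]; then
    path="./$path"   # "./-L" is parsed as a file name, not as an option
  fi
  printf '%s\n' "$path"
}

# Assumed default file name; the real default in the script may differ.
CONFIG_PATH="$(sanitize_path_arg "${OPENCLAW_CONFIG_PATH:-openclaw.json}")"
```

Prefixing `./` leaves the path pointing at the same file while removing the leading hyphen that a command-line parser would otherwise treat as an option.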
Greptile Overview
Greptile Summary
This PR introduces a new WebMCP scanner (…). In the codebase, scanners are invoked by the main orchestrator (…). Merge blockers found:
Confidence Score: 3/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant User
participant Orchestrator as clawpinch.sh
participant WebMCP as scan_webmcp.sh
participant FS as Filesystem
participant JQ as jq
User->>Orchestrator: Run clawpinch (--deep optional)
Orchestrator->>WebMCP: Execute scan_webmcp.sh
WebMCP->>JQ: Verify jq exists
alt jq missing
WebMCP-->>Orchestrator: Emit CHK-WEB-000 (critical)
else jq present
WebMCP->>FS: Read openclaw.json (CONFIG_PATH)
WebMCP->>FS: Gather WebMCP-related files (workspace/skills)
loop For each check (001..008)
WebMCP->>JQ: Parse config fields for endpoints/services/tools
WebMCP->>FS: Scan related files for patterns (origins/caps/descriptions)
WebMCP-->>Orchestrator: Emit finding objects (0..n)
end
WebMCP-->>Orchestrator: Output findings as JSON array
end
Orchestrator-->>User: Render report / return JSON
    TRUSTED_ORIGINS="${WEBMCP_TRUSTED_ORIGINS:-$DEFAULT_TRUSTED_ORIGINS}"
    IFS=',' read -ra TRUSTED_ORIGIN_LIST <<< "$TRUSTED_ORIGINS"

    # ---------------------------------------------------------------------------
    # Sensitive capability keywords
    # ---------------------------------------------------------------------------
    SENSITIVE_CAPS=("filesystem" "shell" "exec" "network" "outbound" "process" "admin" "sudo" "root" "system" "os" "child_process" "spawn" "eval")
    SENSITIVE_DATA_PATTERNS=("memory" "context" "history" "conversation" "agent_state" "session" "credentials" "secrets" "keychain" "token" "MEMORY.md" "SOUL.md" "USER.md")
    PROMPT_INJECTION_PATTERNS=(
      "ignore previous"
      "ignore all previous"
      "disregard"
      "forget your instructions"
      "new instructions"
      "override"
      "you are now"
      "act as"
Host-only trust check
check_untrusted_origins extracts host (stripping scheme) and then calls is_trusted_origin "$host", but the default allow-list includes scheme prefixes like chrome-extension:// / moz-extension:// / safari-web-extension://. Those entries can never match a host-only string, so extension origins will always be flagged as untrusted. Either compare against the full origin (scheme://host[:port]) or remove scheme-based entries from DEFAULT_TRUSTED_ORIGINS and keep the list host-only.
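One way to compare against the full origin, as the comment suggests, is to reduce each URL to `scheme://host[:port]` before matching. This is a minimal sketch under stated assumptions: `url_origin` is a hypothetical helper that does not appear in the PR, the input is assumed to contain `://`, and userinfo (`user@host`) is not handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reduce a URL to its origin (scheme://host[:port]).
# Assumes the input contains "://"; userinfo (user@host) is not stripped.
url_origin() {
  local url="$1"
  local scheme="${url%%://*}"        # text before the first "://"
  local rest="${url#*://}"           # text after the first "://"
  local authority="${rest%%[/?#]*}"  # cut at the first "/", "?" or "#"
  printf '%s://%s\n' "$scheme" "$authority"
}

url_origin "https://example.com:8443/path?x=1"
url_origin "chrome-extension://abcdef/page.html"
```

With origins in this form, scheme-prefixed allow-list entries such as `chrome-extension://` can match again, because the scheme is no longer stripped before the comparison.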
      ---
      name: clawpinch
    - description: "Security audit toolkit for OpenClaw deployments. Scans 63 checks across 8 categories. Use when asked to audit security, harden an installation, check for vulnerabilities, or review config safety."
    + description: "Security audit toolkit for OpenClaw deployments. Scans 71 checks across 9 categories. Use when asked to audit security, harden an installation, check for vulnerabilities, or review config safety."
      version: "1.2.0"
      author: MikeeBuilds
      license: MIT
Version metadata mismatch
This PR updates the advertised check counts/categories and bumps package.json to 1.3.0, but SKILL.md frontmatter still says version: "1.2.0". If SKILL.md is treated as the canonical skill manifest, this will publish inconsistent version info.
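A mismatch like this could be caught mechanically. The sketch below is not part of the PR; the function names are hypothetical. It assumes the `SKILL.md` version is double-quoted as in the excerpt above and that `package.json` is pretty-printed with `"version"` on its own line (a JSON parser such as jq would be more robust).

```bash
#!/usr/bin/env bash
# Hypothetical consistency check (not part of the PR): compare the version in
# SKILL.md frontmatter with the "version" field in package.json.
skill_version() {
  # First line of the form: version: "1.2.0"
  sed -n 's/^version:[[:space:]]*"\([^"]*\)".*$/\1/p' "$1" | head -n 1
}

package_version() {
  # First line of the form: "version": "1.3.0",
  sed -n 's/^[[:space:]]*"version":[[:space:]]*"\([^"]*\)".*$/\1/p' "$1" | head -n 1
}

versions_match() {
  # Succeeds only when both files report the same version string.
  [[ "$(skill_version "$1")" == "$(package_version "$2")" ]]
}
```

A call such as `versions_match SKILL.md package.json || echo "version mismatch"` could run in CI before publishing.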
    is_trusted_origin() {
      local origin="$1"
      for trusted in "${TRUSTED_ORIGIN_LIST[@]}"; do
        trusted="$(echo "$trusted" | xargs)"  # trim whitespace
        if [[ -z "$trusted" ]]; then continue; fi
        # Exact match or substring match (origin starts with trusted prefix)
        if [[ "$origin" == "$trusted" ]] || [[ "$origin" == "${trusted}"* ]] || [[ "$origin" == *"://${trusted}"* ]] || [[ "$origin" == *"://${trusted}:"* ]]; then
          return 0
        fi
      done
      return 1
Overbroad trust matching
is_trusted_origin() treats an origin as trusted if the trusted token appears anywhere after :// (e.g. [[ "$origin" == *"://${trusted}"* ]]). That can incorrectly mark an untrusted URL as trusted when the trusted domain appears in the path/query (e.g., a redirect param). Since CHK-WEB-001 is a critical trust boundary, this should only match the actual origin/host (or a well-defined prefix), not arbitrary substrings.
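A stricter matcher might isolate the host first and compare it exactly, so that text in the path or query can never influence the result. This is a sketch only: `is_trusted_host` and the `TRUSTED_HOSTS` list are illustrative names and values rather than the PR's code, and bracketed IPv6 hosts are not handled.

```bash
#!/usr/bin/env bash
# Hypothetical stricter variant of is_trusted_origin: trust is decided on the
# host alone, never on text that appears in the path or query string.
# TRUSTED_HOSTS is an illustrative host-only allow-list, not the PR's default.
TRUSTED_HOSTS=("localhost" "example.com")

is_trusted_host() {
  local url="$1" rest host trusted
  rest="${url#*://}"       # drop the scheme, if any
  host="${rest%%[/?#]*}"   # keep only the authority
  host="${host##*@}"       # drop userinfo (user:pass@)
  host="${host%%:*}"       # drop the port (IPv6 literals are not handled)
  for trusted in "${TRUSTED_HOSTS[@]}"; do
    if [[ "$host" == "$trusted" ]]; then
      return 0             # exact host match only
    fi
  done
  return 1
}
```

With this shape, `https://evil.test/?redirect=https://example.com` is rejected because its host is `evil.test`, while `http://localhost:3000/` is accepted. Whether subdomains of a trusted host should also pass is a policy decision this sketch leaves out.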
    if [[ -f "$CONFIG_PATH" ]] && command -v jq &>/dev/null; then
      # Extract all service declarations with their origins (null-safe)
      local svc_origins
      svc_origins=$(jq -r '
        [(.webmcp.services // [])[] | {name: (.name // "unnamed"), origin: (.origin // null // "unknown")}] |
        map(select(.origin != null)) |
        group_by(.origin) |
        map(select(length > 0) | {origin: .[0].origin, services: [.[].name]}) |
        .[] | "\(.origin // "unknown")|\(.services | join(","))"
      ' "$CONFIG_PATH" 2>/dev/null || true)
Incorrect origin grouping
In CHK-WEB-004, the jq pipeline (scripts/scan_webmcp.sh:438-444) sets origin: (.origin // null // "unknown"), so .origin is never null and the following map(select(.origin != null)) filters nothing. Services with no declared origin are therefore all grouped under "unknown", which can skew all_origins detection and produce false negatives/positives for origin-isolation warnings. The group_by(.origin) call itself does not need a preceding sort: jq's group_by sorts by the key before grouping, so repeated origins that are not contiguous still land in the same group.
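Both behaviors are easy to check empirically. According to the jq manual, `group_by` returns groups sorted by the key expression, which implies that equal keys are collected regardless of their position in the input. The sketch below uses invented sample data and hypothetical function names, and it runs only if jq is on the PATH.

```bash
#!/usr/bin/env bash
# Sketch: feed jq services whose origins are deliberately non-adjacent and
# see how group_by(.origin) partitions them. The sample data is invented.
group_services_by_origin() {
  jq -c 'group_by(.origin) | map({origin: .[0].origin, services: map(.name)})'
}

# Mirrors the fallback expression used in the excerpt above.
origin_or_default() {
  jq -r '.origin // null // "unknown"'
}

sample='[{"name":"a","origin":"https://x.test"},
         {"name":"b","origin":"https://y.test"},
         {"name":"c","origin":"https://x.test"}]'

if command -v jq >/dev/null 2>&1; then
  printf '%s\n' "$sample" | group_services_by_origin   # two groups expected
  printf '%s\n' '{"name":"d"}' | origin_or_default      # prints: unknown
fi
```

The first call yields two groups (one per distinct origin) even though the two `https://x.test` entries are not adjacent. The second shows why a later `select(.origin != null)` has nothing left to filter.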
Additional Comments (1)
Stale total check count
The header still says this is a reference for “all 63 checks” (references/check-catalog.md:3-4), but this PR adds the WebMCP section (8 new checks), bringing the total to 71. This makes the catalog internally inconsistent with SKILL.md/the PR metadata and will mislead users relying on the catalog as canonical.
🛡️ WebMCP Security Scanner
Adds a new scanner category targeting Chrome 146's WebMCP API (navigator.modelContext), the first security tooling for this emerging attack surface.
What is WebMCP?
Chrome 146 introduces WebMCP, allowing websites to declare structured services that AI agents can interact with directly (instead of navigating the human UI). While powerful, this opens new security vectors that ClawPinch should audit.
New Checks (8)
Files Changed
- scripts/scan_webmcp.sh: New scanner (840+ lines) with full JSON output, deep scan support, and helpers fallback
- references/webmcp-threat-model.md: Comprehensive threat model covering 8 attack vectors
- references/webmcp-checks.md: Detailed check catalog with evidence, remediation, and auto-fix
- references/check-catalog.md: Updated with WebMCP section
- SKILL.md: Updated to 71 checks across 9 categories
- package.json: Version bump to 1.3.0, added webmcp keywords
Why This Matters
WebMCP is brand new (announced today) and nobody has built security tooling for it yet. This positions ClawPinch as the first-mover in WebMCP security auditing.
Context
Inspired by this tweet from @liadyosef about WebMCP + MCP Apps convergence toward agentic UI.
Total checks: 63 → 71 | Categories: 8 → 9 | Version: 1.2.1 → 1.3.0