Skip to content

🔒 [security fix] Fix URL Injection in SkillFetcher#5

Open
frostmute wants to merge 1 commit intomainfrom
security-fix-url-injection-fetcher-11110538775753625202
Open

🔒 [security fix] Fix URL Injection in SkillFetcher#5
frostmute wants to merge 1 commit intomainfrom
security-fix-url-injection-fetcher-11110538775753625202

Conversation

@frostmute
Copy link
Copy Markdown
Owner

🎯 What:

Fixed a URL injection / unescaped user input vulnerability in claw2manus/fetcher.py.

⚠️ Risk:

Malicious users could potentially craft author or name strings containing special characters (like ../ or #) to redirect requests to unexpected URLs or access sensitive files via path traversal on the target servers (GitHub, clawhub.ai).

🛡️ Solution:

Used urllib.parse.quote(..., safe='') to properly escape all user-controllable variables before they are interpolated into URL templates. This ensures that any special characters are safely URL-encoded, treating them as literal path segments rather than control characters.

Verified the fix with a new test suite tests/test_fetcher.py and ensured core functionality is preserved.


PR created automatically by Jules for task 11110538775753625202 started by @frostmute

This commit addresses a security vulnerability where user-provided inputs
(author and name) were directly formatted into URLs without proper escaping.
We now use `urllib.parse.quote` with `safe=''` to sanitize these inputs,
preventing potential URL injection and path traversal attacks.

Affected methods:
- fetch_skill_from_github
- fetch_skill_from_clawhub_website
- discover_author_via_github
- fetch_skill (print statement)

New tests in `tests/test_fetcher.py` verify the correct escaping of
special characters.

Co-authored-by: frostmute <989225+frostmute@users.noreply.github.com>
@google-labs-jules
Copy link
Copy Markdown

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces URL encoding for skill and author identifiers in the SkillFetcher class to prevent malformed requests and adds a corresponding test suite. Feedback suggests further hardening the security by explicitly encoding dots to prevent path traversal and wrapping search terms in quotes for the GitHub Search API. Additionally, it is recommended to refactor the test mocking strategy to avoid global side effects and remove unused imports.

Comment thread claw2manus/fetcher.py
Comment on lines +13 to +15
quoted_author = urllib.parse.quote(author, safe='')
quoted_name = urllib.parse.quote(name, safe='')
url = self.CLAW_HUB_RAW_GITHUB_URL.format(author=quoted_author, name=quoted_name)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

While urllib.parse.quote(..., safe='') correctly encodes slashes to prevent path traversal, it does not encode dots (.) by default. This allows an attacker to provide .. as an author or name, which can still lead to path traversal within the GitHub repository (e.g., accessing files outside the skills/ directory). Explicitly encoding dots or validating the input is recommended to fully mitigate this risk.

Suggested change
quoted_author = urllib.parse.quote(author, safe='')
quoted_name = urllib.parse.quote(name, safe='')
url = self.CLAW_HUB_RAW_GITHUB_URL.format(author=quoted_author, name=quoted_name)
quoted_author = urllib.parse.quote(author, safe='').replace('.', '%2E')
quoted_name = urllib.parse.quote(name, safe='').replace('.', '%2E')
url = self.CLAW_HUB_RAW_GITHUB_URL.format(author=quoted_author, name=quoted_name)

Comment thread claw2manus/fetcher.py
Comment on lines +26 to +27
quoted_name = urllib.parse.quote(name, safe='')
url = self.CLAW_HUB_WEBSITE_URL.format(name=quoted_name)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Similar to the GitHub fetcher, dots are not encoded by urllib.parse.quote, which could allow path traversal via .. sequences in the name parameter. Encoding dots ensures the request remains constrained to the intended path on the target server.

Suggested change
quoted_name = urllib.parse.quote(name, safe='')
url = self.CLAW_HUB_WEBSITE_URL.format(name=quoted_name)
quoted_name = urllib.parse.quote(name, safe='').replace('.', '%2E')
url = self.CLAW_HUB_WEBSITE_URL.format(name=quoted_name)

Comment thread claw2manus/fetcher.py
Comment on lines +50 to +51
quoted_name = urllib.parse.quote(name, safe='')
url = self.GITHUB_SEARCH_API_URL.format(name=quoted_name)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The GitHub Search API interprets characters like spaces and + as term separators in the q parameter. Even when URL-encoded, these characters can break the path: qualifier logic. Wrapping the {name} placeholder in double quotes within the search query ensures it is treated as a single literal path segment by the search engine.

Suggested change
quoted_name = urllib.parse.quote(name, safe='')
url = self.GITHUB_SEARCH_API_URL.format(name=quoted_name)
quoted_name = urllib.parse.quote(name, safe='')
url = self.GITHUB_SEARCH_API_URL.format(name=f'"{quoted_name}"')

Comment thread tests/test_fetcher.py
Comment on lines +5 to +7
mock_requests = MagicMock()
sys.modules['requests'] = mock_requests
sys.modules['bs4'] = MagicMock()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Mocking dependencies by directly modifying sys.modules is a global side effect that can interfere with other tests and is generally considered brittle. It is better to use unittest.mock.patch to locally mock the requests and bs4 modules. Additionally, the patch import on line 10 is currently unused.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant