fix(web): ensure fresh URL fetch context reaches model #208
punishell wants to merge 4 commits into sipeed:main
Make web_fetch return the fetched payload to the LLM (not just metadata), fix HTML text extraction regex replacements, and add a prompt rule requiring fresh web_fetch calls for URL visit/check requests to prevent stale memory answers.

Co-authored-by: Cursor <cursoragent@cursor.com>
@Zepan Ensures a fresh URL fetch context reaches the model; fixes the web_fetch tool breaking silently when context is reused. Recommendation: Merge. +8/-6, small correctness fix for web tools.
Leeaandrob
left a comment
PR #208 Deep Review — @punishell
Hey @punishell, thanks for diagnosing this. The root cause is spot-on: ForLLM was sending only metadata ("Fetched 5000 bytes from ...") but never the actual page content. The LLM literally could not see what was fetched, so it hallucinated the page content. This PR correctly fixes that.
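To illustrate the difference described above, here is a minimal sketch of a tool result before and after the fix. Only the `ForLLM` field name and the `string(resultJSON)` assignment come from this PR; the `ToolResult` struct, the `ForUser` field, the helper names, and the JSON keys are assumptions made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ToolResult is a hypothetical stand-in for the project's tool result type.
type ToolResult struct {
	ForLLM  string // text the model receives as the tool response
	ForUser string // short status line (assumed field, for illustration)
}

// fetchResultBefore mimics the pre-fix behaviour: the model only sees metadata.
func fetchResultBefore(url, text string) ToolResult {
	summary := fmt.Sprintf("Fetched %d bytes from %s", len(text), url)
	return ToolResult{ForLLM: summary, ForUser: summary}
}

// fetchResultAfter mimics the fixed behaviour: the model sees the payload itself.
func fetchResultAfter(url, text string) (ToolResult, error) {
	// The JSON keys below are illustrative, not the real payload schema.
	payload := map[string]any{"url": url, "length": len(text), "text": text}
	resultJSON, err := json.Marshal(payload)
	if err != nil {
		return ToolResult{}, err
	}
	return ToolResult{
		ForLLM:  string(resultJSON),
		ForUser: fmt.Sprintf("Fetched %d bytes from %s", len(text), url),
	}, nil
}

// containsPageText reports whether the model-facing text includes the page text.
func containsPageText(r ToolResult, text string) bool {
	return strings.Contains(r.ForLLM, text)
}

func main() {
	page := "PicoClaw release notes"
	before := fetchResultBefore("https://example.com", page)
	after, err := fetchResultAfter("https://example.com", page)
	if err != nil {
		panic(err)
	}
	fmt.Println(containsPageText(before, page)) // false
	fmt.Println(containsPageText(after, page))  // true
}
```

With the metadata-only variant the model has nothing to quote from, which is consistent with the hallucinated page content reported in this PR.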
Verification
- Checked out branch locally: `fix/web-fetch-fresh-url`
- `go vet ./pkg/tools/... ./pkg/agent/...`: clean
- `go test ./pkg/tools/... -run "Web|Fetch|Search"`: all 10 tests passing
- `go test ./pkg/agent/...`: all 10 tests passing
- Traced the `ForLLM` → `contentForLLM` → tool result message path (agent/loop.go:551-558); confirmed `ForLLM` is what the model receives as the tool response
- Verified `ReplaceAllLiteralString("")` vs `ReplaceAllString("")` produce identical results when the replacement is an empty string (no `$` expansion possible)
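The last verification point can be reproduced with a small standalone program. This is a sketch using an illustrative tag-stripping pattern, not the actual patterns from web.go; the `tagRe` and `stripBoth` names are made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// tagRe is an illustrative pattern, not the one used in web.go.
var tagRe = regexp.MustCompile(`<[^>]+>`)

// stripBoth removes matches with both methods, using an empty replacement.
func stripBoth(html string) (expanded, literal string) {
	return tagRe.ReplaceAllString(html, ""), tagRe.ReplaceAllLiteralString(html, "")
}

func main() {
	expanded, literal := stripBoth("<p>Price: $1 <b>only</b></p>")
	fmt.Println(expanded == literal) // true: an empty replacement has nothing to expand

	// The two methods only diverge when the replacement itself contains `$`.
	re := regexp.MustCompile(`(\w+)@(\w+)`)
	fmt.Println(re.ReplaceAllString("user@host", "$2"))        // host
	fmt.Println(re.ReplaceAllLiteralString("user@host", "$2")) // $2
}
```

A `$` in the input text does not matter here; expansion applies only to the replacement argument.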
Summary of Findings
HIGH (Should Fix)
- H1: Existing test `TestWebTool_WebFetch_Success` now passes by coincidence, not by intent
MEDIUM
- M1: `ReplaceAllLiteralString` → `ReplaceAllString` is a no-op change (adds diff noise)
LOW
- L1: System prompt rule is English-only
POSITIVE
- Critical bug fix: `ForLLM` was only sending metadata, never the actual content. `web_fetch` was fundamentally broken.
- System prompt defense-in-depth rule is a reasonable pattern to prevent hallucination
- Small, focused changeset — exactly what's needed, nothing more
- Content extraction and truncation logic is correctly preserved
Verdict: REQUEST CHANGES
What needs to change before merge:
- Update `TestWebTool_WebFetch_Success` (web_test.go:39-42) to validate the new ForLLM format (see H1)
- Remove the no-op `ReplaceAllLiteralString` → `ReplaceAllString` changes, or justify why they're needed (see M1)
Estimated effort: ~30 minutes. Happy to re-review.
Use URL-presence rules instead of English-only keywords, and inject a runtime system note when user messages contain URLs so responses are based on fresh web_fetch results in the same turn.
Leeaandrob
left a comment
PR #208 Re-Review (Post Commit ed0a2e8)
Verdict: REQUEST_CHANGES ❌
@punishell — Thanks for addressing the earlier feedback. The no-op ReplaceAllLiteralString changes are gone, and the English-only keyword approach was replaced with regex-based URL detection. However, the new commit introduces a critical false-positive problem with the URL regex that makes this worse than the original state for many common conversations.
What was fixed (from previous review)
- ✅ No-op `ReplaceAllLiteralString` → `ReplaceAllString` changes removed. Clean diff now.
- ✅ English-only keyword list replaced. The new approach uses URL regex detection, which is language-agnostic.
- ✅ Core bug fix intact: `ForLLM: string(resultJSON)` correctly sends actual content to the model.
- ✅ All tests pass: `go test -race ./pkg/tools/... ./pkg/agent/...` both PASS.
- ✅ `go vet` clean
New Findings
[HIGH] URL regex has severe false positives — matches filenames, JS libraries, and PicoClaw's own files
File: pkg/agent/loop.go:48

    var userMessageURLRegex = regexp.MustCompile(`(?i)(https?://[^\s<>"')\]]+|\b[a-z0-9][a-z0-9.-]*\.[a-z]{2,}\b(?:/[^\s<>"')\]]*)?)`)

The bare-domain pattern `\b[a-z0-9][a-z0-9.-]*\.[a-z]{2,}\b` matches ANY word.extension where the extension is 2+ characters. Tested locally:
| Input | Match | Problem |
|---|---|---|
| config.json | ✅ MATCH | Every config discussion triggers it |
| MEMORY.md | ✅ MATCH | PicoClaw's own workspace file! |
| node.js | ✅ MATCH | Extremely common in tech conversations |
| vue.js | ✅ MATCH | Common JS library |
| three.js | ✅ MATCH | Common JS library |
| my-app.service | ✅ MATCH | Systemd service discussions |
| https://example.com | ✅ MATCH | Correct |
| github.com/sipeed | ✅ MATCH | Correct |
When a false positive fires, the system injects:
[System note: This message includes URL(s). Before answering, you must call web_fetch for each URL...]
The model then tries to web_fetch("config.json") or web_fetch("MEMORY.md"), which fails and wastes tool calls. This makes the user experience worse than without the feature for any tech conversation mentioning filenames.
Fix: Only match https?:// URLs (not bare domains). The bare domain pattern causes far more harm than good:

    var userMessageURLRegex = regexp.MustCompile(`https?://[^\s<>"')\]]+`)

If bare domain support is desired, require at least a TLD from a known list (.com, .org, .io, .dev, etc.) rather than any 2+ char extension.
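The false positives can be checked with a short program that runs both patterns over some of the inputs from the table. The two patterns are copied from this review; the variable names `broadRe` and `narrowRe` are made up for the sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// broadRe is the pattern from the PR: scheme URLs plus bare domains.
var broadRe = regexp.MustCompile(`(?i)(https?://[^\s<>"')\]]+|\b[a-z0-9][a-z0-9.-]*\.[a-z]{2,}\b(?:/[^\s<>"')\]]*)?)`)

// narrowRe is the suggested replacement: explicit http/https URLs only.
var narrowRe = regexp.MustCompile(`https?://[^\s<>"')\]]+`)

func main() {
	inputs := []string{
		"config.json", "MEMORY.md", "node.js", "my-app.service",
		"https://example.com", "github.com/sipeed",
	}
	for _, in := range inputs {
		// Print whether each pattern finds a match anywhere in the input.
		fmt.Printf("%-20s broad=%-5t narrow=%t\n",
			in, broadRe.MatchString(in), narrowRe.MatchString(in))
	}
}
```

The narrow pattern trades recall for precision: `github.com/sipeed` no longer matches, and because the `(?i)` flag is dropped, an upper-case scheme such as `HTTPS://` would not match either.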
[MEDIUM] No tests for enforceFreshWebFetchHint
File: pkg/agent/loop.go:281-289
The new function has zero test coverage. Given the regex issues above, tests are especially important to catch regressions. At minimum:
func TestEnforceFreshWebFetchHint(t *testing.T) {
// Should inject note for real URLs
assert.Contains(t, enforceFreshWebFetchHint("check https://example.com"), "[System note")
// Should NOT inject note for filenames
assert.Equal(t, "edit config.json", enforceFreshWebFetchHint("edit config.json"))
// Should NOT inject note for plain text
assert.Equal(t, "hello world", enforceFreshWebFetchHint("hello world"))
}

[LOW] System prompt rule #3 and enforceFreshWebFetchHint are redundant
The PR adds URL enforcement in two places:
1. System prompt rule #3 in `context.go`: tells the model generically to fetch URLs
2. `enforceFreshWebFetchHint` in `loop.go`: detects URLs and injects a system note per-message
Both serve the same purpose. The per-message injection (#2) is more forceful and makes #1 somewhat redundant. Consider keeping only one approach to avoid double-prompting the model.
Summary
| Severity | Count | Details |
|---|---|---|
| CRITICAL | 0 | — |
| HIGH | 1 | URL regex false positives (config.json, MEMORY.md, node.js, etc.) |
| MEDIUM | 1 | No tests for enforceFreshWebFetchHint |
| LOW | 1 | Redundant enforcement in system prompt + per-message injection |
| POSITIVE | 1 | Core bug fix (ForLLM now sends actual content) is correct and valuable |
The core fix in web.go (line 407) is valuable and correct. The problem is entirely in the new URL detection logic in loop.go. I'd recommend:
- Simplify the regex to only match `https?://` URLs (drop bare domain matching)
- Add unit tests for `enforceFreshWebFetchHint`
- Consider removing the redundant system prompt rule #3 since the per-message injection is more reliable
Co-authored-by: Leandro Barbosa <leandrobar93@gmail.com>
Add unit tests for enforceFreshWebFetchHint, narrow URL detection to explicit http/https links to avoid filename false positives, and remove the redundant global system-prompt rule in favor of runtime enforcement.
Make web_fetch return the fetched payload to the LLM (not just metadata), fix HTML text extraction regex replacements, and add a prompt rule requiring fresh web_fetch calls for URL visit/check requests to prevent stale memory answers.
The bot was constantly hallucinating when asked, e.g., to visit a website.