Summary
PR #2053 introduced MemorySourceHint to suppress false-positive injection detection for memory retrieval content. However, the fix only covers the context assembly path (assembly.rs) where static memory is inserted into context. The tool execution path (tool_execution/mod.rs) for the memory_search tool is not covered.
Root Cause
In tool_execution/mod.rs::sanitize_tool_output() (line ~289-295), all tool outputs — including memory_search — are classified as ContentSourceKind::ToolResult:
```rust
let kind = if tool_name.contains(':') || ... {
    ContentSourceKind::McpResponse
} else if tool_name == "web-scrape" || ... {
    ContentSourceKind::WebScrape
} else {
    ContentSourceKind::ToolResult // memory_search falls here
};
```

The MemorySourceHint suppression in sanitizer/lib.rs only activates for ContentSourceKind::MemoryRetrieval. Since memory_search output is classified as ToolResult, it still undergoes full injection detection.
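To make the gating concrete, here is a minimal, simplified model of the behavior described above. It is a sketch, not the code in sanitizer/lib.rs: the function `count_injection_flags`, the `PATTERNS` list, and the rule "suppress whenever a hint is present" are all assumptions made for illustration, and the enums are reduced stand-ins for the real types.

```rust
// Reduced stand-ins for the real sanitizer types (assumption: the actual
// enums have more variants and may be shaped differently).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ContentSourceKind {
    ToolResult,
    MemoryRetrieval,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemorySourceHint {
    ConversationHistory,
}

// Hypothetical pattern list, taken from the phrases quoted in this issue.
const PATTERNS: &[&str] = &["system prompt", "show instructions"];

/// Hypothetical model of injection detection: returns how many patterns
/// match, unless the content is memory retrieval carrying a hint.
pub fn count_injection_flags(
    content: &str,
    kind: ContentSourceKind,
    hint: Option<MemorySourceHint>,
) -> usize {
    // Suppression is keyed on the source kind, so a ToolResult never gets here.
    if kind == ContentSourceKind::MemoryRetrieval && hint.is_some() {
        return 0;
    }
    let lowered = content.to_lowercase();
    PATTERNS.iter().filter(|p| lowered.contains(**p)).count()
}
```

Under this model, the same recalled text is flagged when it arrives as a ToolResult and passes cleanly when it arrives as MemoryRetrieval with a hint, which is the asymmetry this issue reports.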
Reproduction
Config: content_isolation.enabled = true, flag_injection_patterns = true
- Save memory containing "system prompt":
  "Remember this: my previous system prompt was 'be helpful'."
- Recall memory:
  "What did I tell you about my previous setup? Recall from memory."
Log Output (session-ci21.log)
WARN zeph_core::agent::tool_execution: injection patterns detected in tool output tool=memory_search flags=1
WARN zeph_core::agent::persistence: exfiltration guard: skipping Qdrant embedding for flagged content event=MemoryWriteGuarded { reason: "content contained injection patterns flagged by ContentSanitizer" }
Impact
- Spurious WARNs in every session where recalled memory contains "system prompt", "show instructions", or other injection patterns (legitimate user content)
- Qdrant embedding skipped (persistence.rs:346 skip_embedding = true): the agent's response after the flagged recall is not embedded in Qdrant. This degrades semantic recall quality over time for any conversation discussing prompts/instructions.
- The intent of PR #2053 (fix(sanitizer): suppress false positives for memory retrieval, Issue #2025), namely fixing false positives for memory recall, is not fully achieved for the memory_search tool path.
Expected Fix
tool_execution/mod.rs::sanitize_tool_output() should detect when tool_name == "memory_search" and use ContentSourceKind::MemoryRetrieval with MemorySourceHint::ConversationHistory instead of ContentSourceKind::ToolResult.
Alternatively, the memory_search tool output formatter could annotate its content to distinguish it from external tool output.
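The first option could look roughly like the sketch below. It is written as a standalone helper so it compiles on its own; `classify_tool_output` is a hypothetical name, the enums are simplified stand-ins for the real types, and the additional conditions elided as `...` in the original snippet are deliberately left out rather than guessed.

```rust
// Simplified stand-ins for the types named in this issue (assumption: the
// real definitions may carry more variants or fields).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ContentSourceKind {
    ToolResult,
    McpResponse,
    WebScrape,
    MemoryRetrieval,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemorySourceHint {
    ConversationHistory,
}

/// Hypothetical helper: map a tool name to a source kind plus an optional
/// memory hint. Only the conditions visible in the issue are reproduced.
pub fn classify_tool_output(
    tool_name: &str,
) -> (ContentSourceKind, Option<MemorySourceHint>) {
    if tool_name == "memory_search" {
        // New branch: treat recalled memory as memory retrieval so the
        // existing false-positive suppression can apply.
        (
            ContentSourceKind::MemoryRetrieval,
            Some(MemorySourceHint::ConversationHistory),
        )
    } else if tool_name.contains(':') {
        (ContentSourceKind::McpResponse, None)
    } else if tool_name == "web-scrape" {
        (ContentSourceKind::WebScrape, None)
    } else {
        (ContentSourceKind::ToolResult, None)
    }
}
```

Checking for memory_search before the generic branches keeps the change local to the classifier: every other tool is classified as before, and the sanitizer itself would not need to change.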
Test Evidence
- Version: v0.16.0 (commit c53bc6c)
- Config: .local/config/testing-ci21.toml
- Log: .local/testing/debug/session-ci21.log
- CI session: CI-21 (2026-03-20)