
Fix memory dedup to use exact entry matching #22

Open
ZoranPandovski wants to merge 3 commits into main from fix-dedup

ZoranPandovski (Member) commented Mar 23, 2026

The duplicate check in hippocampus used `if text in content`, which matches substrings anywhere in the file. This causes two kinds of false rejects, e.g.:

  • A new entry "Use httpx" is rejected when the file already contains "Use httpx with timeout=15", because the shorter text is a substring of the longer entry.
  • A new entry "Use httpx" is rejected when that text appears only in a rule's metadata or inside another, unrelated entry's text.

Now we parse each `- ` entry line, strip the metadata, and compare against the exact entry text.
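
For reviewers, here is a minimal sketch of the exact-match idea, not the code in this PR. The function names `entry_texts` and `is_duplicate` are hypothetical, and the metadata format (a trailing HTML comment on the entry line) is an assumption made only for illustration; the real format in hippocampus may differ.

```python
import re

# ASSUMPTION: metadata is a trailing HTML comment on the entry line,
# e.g. "- Use httpx <!-- added=2026-03-23 -->". The real format may differ.
_METADATA = re.compile(r"\s*<!--.*?-->\s*$")


def entry_texts(content: str) -> set[str]:
    """Collect the text of every "- " entry line with metadata stripped."""
    entries = set()
    for line in content.splitlines():
        stripped = line.strip()
        if not stripped.startswith("- "):
            continue  # headings, blank lines, and other non-entry lines
        text = _METADATA.sub("", stripped[2:]).strip()
        entries.add(text)
    return entries


def is_duplicate(text: str, content: str) -> bool:
    """Return True only if `text` equals an existing entry exactly."""
    return text.strip() in entry_texts(content)


content = "# Rules\n- Use httpx with timeout=15 <!-- source=Use httpx -->\n"

print("Use httpx" in content)                              # old check: True
print(is_duplicate("Use httpx", content))                  # new check: False
print(is_duplicate("Use httpx with timeout=15", content))  # new check: True
```

With the old substring check, "Use httpx" would be rejected twice over here: once by the longer entry and once by the metadata. Comparing whole entries avoids both cases while still catching a true duplicate.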

@torrmal I think there is another bug where we lose metadata during compaction. I will open a separate PR for that.
