Feat: Memory Summarization and User Registry #2
Open
cyrexez wants to merge 6 commits into sobowalebukola:main from
8a63de6 to 13870e4
Dockerfile.ollama (outdated)
    # Base Ollama image
    FROM ollama/ollama:latest

    # We comment these out because the models are already in your local volume.
Owner
This assumes that anyone running it for the first time already has deepseek and nomic-embed installed on their machine. We shouldn't comment this out.
internal/handlers/chat.go (outdated)
    userBio, err := h.Manager.GetUserBio(ctx, userID)
    if err != nil {
        log.Printf("Could not fetch user bio: %v", err)
        userBio = "A software project called MemCortex."
Owner
I still feel opinionated about defaulting the userBio to 'A software project called MemCortex'.
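One possible alternative to a hardcoded default, sketched under the assumption that the handler builds the system prompt from a base string plus the bio: leave the bio out entirely when it cannot be fetched. buildSystemPrompt and its parameters are hypothetical names for illustration, not code from this PR.

```go
package main

import "strings"

// buildSystemPrompt is a hypothetical helper (not part of this PR).
// It appends the user's bio to the base prompt only when a bio exists.
func buildSystemPrompt(base, userBio string) string {
	bio := strings.TrimSpace(userBio)
	if bio == "" {
		// No bio available: send the base prompt alone rather than
		// substituting a hardcoded project description.
		return base
	}
	return base + "\n\nUser bio: " + bio
}
```

With this shape, the error branch in the handler could log the failure and pass an empty userBio, so a missing profile never injects project-specific text into another user's prompt.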



Overview
This PR introduces a tiered memory architecture, persistent AI models, and a robust user identity layer to ensure personalized, hallucination-free AI responses.
Technical additions
Multi-User Support: Implemented a dedicated User class in Weaviate to store unique profiles, bios, and metadata.
Contextual Anchor: The system now fetches the user's "Bio" before every interaction. This bio is injected into the LLM system prompt, ensuring the AI understands the specific technical context of the user (e.g., "Software Engineer working on Go").
Isolated Memory Streams: Memories are strictly partitioned by userId, allowing for personalized summarization thresholds and retrieval per user.
Bind Mount Migration: Updated docker-compose.yml to map ./ollama_storage to /root/.ollama. This ensures that large models like deepseek-r1:1.5b persist across container restarts.
Refined service healthchecks so the Go server only attempts to connect once the Ollama model runner is fully responsive.
Progress Logging: Added progress messages to the go-server logs to make future debugging and issue identification easier.
Integrated a Summarizer package that uses DeepSeek-R1 to distill fragmented memories into concise summaries once a threshold (e.g., 5 items) is reached.
By anchoring responses with the User Bio, the model correctly identifies MemCortex as a software project, eliminating the hallucinations previously observed.
Embedding Worker Pool: Added a concurrent queue with a 5-retry exponential backoff to handle high-volume embedding requests without crashing the local AI server.
Self-Bootstrapping Schema: The application now manages its own schema creation on startup, automatically configuring Weaviate classes for Users and Memories.
Bi-directional Logging: The system saves both User queries and AI responses, ensuring "past context" remains coherent.
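The retry behaviour described for the embedding worker pool could look roughly like the sketch below. It is not the PR's code: it reads "5-retry" as five attempts in total, assumes the delay doubles after each failure, and uses hypothetical names (backoffDelay, retryWithBackoff, ErrEmbeddingUnavailable).

```go
package main

import (
	"errors"
	"time"
)

// ErrEmbeddingUnavailable is a placeholder error that a hypothetical
// embedding call might return while the local AI server is busy.
var ErrEmbeddingUnavailable = errors.New("embedding server unavailable")

// backoffDelay returns base * 2^attempt, where attempt starts at 0.
func backoffDelay(attempt int, base time.Duration) time.Duration {
	return base << uint(attempt)
}

// retryWithBackoff calls fn until it succeeds or maxAttempts is reached,
// sleeping for an exponentially growing delay between failed attempts.
// It returns nil on success, otherwise the last error from fn.
func retryWithBackoff(maxAttempts int, base time.Duration, fn func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if attempt < maxAttempts-1 {
			// Skip the sleep after the final attempt.
			time.Sleep(backoffDelay(attempt, base))
		}
	}
	return err
}
```

A worker in the pool would wrap each embedding request in a call such as retryWithBackoff(5, 500*time.Millisecond, ...), where the base delay is an assumed value rather than one stated in this PR.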
Results
User A's memories do not bleed into User B's context.
The Summarizer is triggered once the summary threshold is reached and returns a response that includes past context.
Hallucinations are reduced because of the prompt restrictions.
Proof is in the image_tests folder (captured using Thunder Client).
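The first two results (per-user isolation and threshold-triggered summarization) can be illustrated with a small in-memory sketch. This is a stand-in for the Weaviate-backed implementation, not the PR's code; memoryStore, add, and drain are hypothetical names, and the threshold of 5 follows the example value given in the description.

```go
package main

// memoryStore keeps pending memories partitioned by user ID.
// It is an in-memory stand-in for the Weaviate-backed store.
type memoryStore struct {
	threshold int
	byUser    map[string][]string
}

func newMemoryStore(threshold int) *memoryStore {
	return &memoryStore{threshold: threshold, byUser: make(map[string][]string)}
}

// add appends a memory to one user's stream and reports whether that
// user's stream has reached the summarization threshold.
func (s *memoryStore) add(userID, memory string) bool {
	s.byUser[userID] = append(s.byUser[userID], memory)
	return len(s.byUser[userID]) >= s.threshold
}

// drain returns one user's pending memories and clears them, as a
// summarizer would do after distilling them into a summary.
func (s *memoryStore) drain(userID string) []string {
	pending := s.byUser[userID]
	delete(s.byUser, userID)
	return pending
}
```

Because every read and write is keyed by userID, adding memories for one user never changes the count or contents seen for another, which is the property the first result describes.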