From 8df0678e36d547ec819cb7b084c55f4def5b7497 Mon Sep 17 00:00:00 2001 From: Cursor Agent Date: Fri, 23 Jan 2026 19:09:26 +0000 Subject: [PATCH] Add Memoria technical overview blog post for basehub Co-authored-by: lorenzo --- blog/memoria-technical-overview.mdx | 269 ++++++++++++++++++++++++++++ 1 file changed, 269 insertions(+) create mode 100644 blog/memoria-technical-overview.mdx diff --git a/blog/memoria-technical-overview.mdx b/blog/memoria-technical-overview.mdx new file mode 100644 index 0000000..8409443 --- /dev/null +++ b/blog/memoria-technical-overview.mdx @@ -0,0 +1,269 @@ +--- +title: "Memoria: A Technical Overview of Venice's Memory System" +description: "A deep dive into how Venice remembers your conversations while keeping your data private" +date: "2026-01-23" +author: "Venice Team" +tags: + - privacy + - technology + - memoria + - ai + - features +image: null +featured: true +--- + +# Memoria: A Technical Overview of Venice's Memory System + +*A deep dive into how Venice remembers your conversations while keeping your data private* + +--- + +## Introduction + +Venice is proud to introduce **Memoria**, our privacy-preserving memory system that enables AI to remember context from your past conversations. Unlike traditional cloud-based memory systems, Memoria stores all data locally in your browser, ensuring that your memories never leave your device. + +This technical overview explains how Memoria works, what data it stores, and answers common questions about the feature. + +--- + +## How Memoria Works + +### The Core Concept + +Memoria uses **vector embeddings** to understand and retrieve relevant memories. When you chat with Venice: + +1. **During conversation**: Your messages are converted into mathematical representations (vectors) that capture their meaning +2. **Automatic extraction**: Every few messages, Venice extracts key information and insights from the conversation +3. 
**Intelligent retrieval**: When you start a new conversation, Memoria searches for relevant past memories and provides them as context to the AI + +This creates a more personalized experience where the AI can reference things you've discussed before, remember your preferences, and build on previous conversations. + +### Architecture Overview + +``` +┌─────────────────────────────────────────────────────────────────────┐ +│ Your Browser │ +├─────────────────────────────────────────────────────────────────────┤ +│ IndexedDB (Local Storage) │ +│ ├── Memory records (text summaries) │ +│ ├── Vector embeddings (1024-dimensional, compressed) │ +│ ├── Sparse search tokens │ +│ └── Metadata (timestamps, sources, importance scores) │ +├─────────────────────────────────────────────────────────────────────┤ +│ FAISS WASM Vector Index │ +│ └── In-memory index for fast similarity search │ +└─────────────────────────────────────────────────────────────────────┘ + ▲ + │ Search queries + ▼ +┌─────────────────────────────────────────────────────────────────────┐ +│ Venice Servers (Stateless) │ +├─────────────────────────────────────────────────────────────────────┤ +│ • Generate embeddings for search queries │ +│ • Extract memory summaries from conversations │ +│ • No memory storage - all processing is transient │ +└─────────────────────────────────────────────────────────────────────┘ +``` + +--- + +## Data Storage & Privacy + +### What Data Is Stored? 
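
To make the details below concrete, here is a sketch of the shape a single stored record might take as a TypeScript type. The field names are illustrative assumptions for exposition, not Venice's actual IndexedDB schema.

```typescript
// Illustrative shape of one locally stored memory record.
// Field names are assumptions, not Venice's actual schema.
interface MemoryRecord {
  id: string;                               // unique key within IndexedDB
  text: string;                             // e.g. "User is learning Python programming"
  embedding: string;                        // 1024-dim vector, int8-quantized then base64 (~1.4KB)
  sparseTokens: string[];                   // keywords for hybrid search
  source: string;                           // "venice", "manual", or a file identifier
  memoryType: "extracted_summary" | "user"; // how the memory was created
  importance: number;                       // 1-10, lower = more important
  createdAt: string;                        // ISO date string
  lastAccessedAt: string;                   // ISO date string
}

const example: MemoryRecord = {
  id: "mem_001",
  text: "User is learning Python programming",
  embedding: "<base64 int8 vector>",
  sparseTokens: ["python", "programming", "learning"],
  source: "venice",
  memoryType: "extracted_summary",
  importance: 3,
  createdAt: new Date().toISOString(),
  lastAccessedAt: new Date().toISOString(),
};
```

Each field in this sketch corresponds to a row in the table that follows.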
Memoria stores the following data locally in your browser:

| Data Type | Description | Example |
|-----------|-------------|---------|
| **Memory Text** | Summary of conversation insights | "User is learning Python programming" |
| **Vector Embedding** | 1024-dimensional mathematical representation | Compressed to ~1.4KB per memory |
| **Sparse Tokens** | Keywords for hybrid search | ["python", "programming", "learning"] |
| **Source** | Where the memory came from | "venice" (auto-extracted), "manual", or file identifier |
| **Memory Type** | Classification of the memory | "extracted_summary", "user" |
| **Importance Score** | 1-10 rating of relevance (lower = more important) | 2 |
| **Timestamps** | Creation and access times | ISO date strings |

### Privacy Guarantees

1. **Local-only storage**: All memory data is stored in IndexedDB within your browser. It never leaves your device unless explicitly shared.

2. **Vector salting**: Even the mathematical representations of your memories are transformed using a user-specific cryptographic salt derived from your encryption key. This means:
   - Your embeddings are unique to you
   - They cannot be correlated with other users' data
   - Even if intercepted, they cannot be trivially reversed to recover your original text

3. **No server-side storage**: Venice servers only generate embeddings transiently. They do not store your memories or the content used to create them.

4. **Anonymized model protection**: You can disable memory sharing with third-party ("anonymized") models that may have different privacy guarantees than Venice's private models.

---

## Chat Memory vs. 
Character Memory + +Memoria provides two separate memory pools: + +### Chat Memory + +- Used in regular conversations (non-character chats) +- Enabled via Settings → Memory → Chat Memory toggle +- Memories are scoped to a special "chat_memory" identifier +- Ideal for general knowledge about you, your preferences, and ongoing projects + +### Character Memory + +- Used when chatting with AI characters +- Each character has their own isolated memory pool +- Memories are scoped by character ID +- Enables characters to "remember" your relationship and past conversations + +**Important**: Chat Memory and Character Memory are completely separate. Memories from regular chats won't appear in character conversations, and vice versa. + +--- + +## Document Uploads + +You can enhance memory by uploading documents (PDF or text files): + +### How Document Processing Works + +1. **Text extraction**: Documents are parsed to extract readable text +2. **Chunking**: Large documents are split into overlapping segments (~1200 characters with 200-character overlap) +3. **Embedding**: Each chunk is converted to a vector embedding +4. **Storage**: Chunks are stored with a unique source identifier derived from the filename + +### Using Document Memories + +- **Enable/Disable per document**: You can toggle individual documents on/off without deleting them +- **Source filtering**: Disabled documents are excluded from memory searches +- **File limits**: Up to 50 documents for Chat Memory, 15 per character + +### Best Practices for Documents + +- Upload reference material you want the AI to remember +- Use clear, descriptive filenames (the filename is included in the first chunk) +- Supported formats: PDF, plain text, and other text-based files +- Password-protected PDFs are not supported + +--- + +## Frequently Asked Questions + +### Does disabling Chat Memory delete all my memories? + +**No.** Disabling Chat Memory only stops the AI from accessing and creating new memories. 
Your existing memories remain stored in your browser and will be available again if you re-enable the feature. + +To actually delete memories, you must explicitly delete them through: +- Settings → Memory → Delete individual memories +- Settings → Memory → Delete all (for interaction memories) +- Settings → Memory → Delete all documents (for uploaded files) + +### Does clearing browser data / logging out delete my memories? + +**Yes, potentially.** Since memories are stored in IndexedDB (browser local storage): + +- **Clearing site data for venice.ai** will delete all memories +- **Clearing all browser data** will delete all memories +- **Logging out** does not delete memories (they remain in IndexedDB) +- **Using a different browser or device** means you won't have access to memories stored elsewhere + +**Recommendation**: Consider your memories as browser-specific. If you use multiple devices, each will have its own independent memory store. + +### Does restoring from a backup restore my memories? + +**No, not currently.** The backup/restore system backs up your conversations, characters, and settings, but **memories are not included** in backups. + +This is a known limitation. Memories are stored in a separate database structure optimized for vector search, and backup integration is planned for a future release. + +### How do I best use this feature? + +**For Chat Memory:** + +1. Enable Chat Memory in Settings → Memory +2. Optionally disable "Auto-generate memories" if you prefer manual control +3. Add important facts manually using the "Add Memory" button +4. Upload reference documents you want the AI to remember +5. Periodically review and clean up irrelevant memories + +**For Character Memory:** + +1. Enable Character Memory in the character's settings +2. Use the "Extraction prompt" field to customize what the AI remembers about conversations +3. Upload character-specific documents (lore, backstory, reference material) +4. 
Review memories in the character's Memory tab + +**Pro tips:** + +- Memories work best when they're concise and factual +- The AI retrieves the most relevant memories based on your current message +- You can edit memories to correct or refine them +- Toggle off documents temporarily rather than deleting if you might need them later + +### What data is shared with anonymized (third-party) models? + +When using models not hosted directly by Venice (marked as "Anonymized"): + +- **By default**: Memories are NOT shared with these models +- **If enabled**: The "Share memories with anonymized models" toggle allows memory context to be sent +- **Privacy note**: Third-party providers may have different data retention policies than Venice + +We recommend keeping this toggle off unless you specifically need memory context with a particular third-party model. + +--- + +## Technical Deep Dive + +### Hybrid Search Algorithm + +Memoria uses a sophisticated hybrid search combining: + +1. **Dense vector search**: FAISS-based similarity search using compressed int8 vectors +2. **Sparse BM25-style search**: Keyword matching for precise term recall +3. **Reciprocal Rank Fusion (RRF)**: Combines both approaches with adaptive weighting + +The search dynamically adjusts its strategy based on your query: + +- Short queries (≤4 words) favor keyword matching +- Longer queries favor semantic similarity +- Strong keyword matches boost the sparse component + +### Memory Extraction + +Every 3rd assistant response, Memoria automatically extracts insights: + +1. Collects the last 5 user messages and 5 assistant messages +2. Sends to an extraction model with a specialized prompt +3. Receives a summary and importance score +4. Filters out unimportant memories (importance > 8) +5. 
Stores the memory with a salted embedding + +### Compression & Efficiency + +To minimize storage impact: + +- Embeddings are quantized from float32 to int8 (~75% reduction) +- Embeddings are base64 encoded for storage (~67% total savings) +- Each memory uses approximately 1.4KB of storage +- The browser quota for IndexedDB is typically 100MB-1GB, supporting thousands of memories + +--- + +## Privacy Summary + +| Aspect | Status | +|--------|--------| +| Memory storage location | Your browser only (IndexedDB) | +| Server-side memory storage | None | +| Cross-device sync | Not supported | +| Encryption | Salted embeddings with user-specific keys | +| Third-party model access | Off by default, user-controlled | +| Backup inclusion | Not yet supported | +| Data portability | Browser-specific | + +--- + +## Conclusion + +Memoria represents Venice's commitment to AI capabilities without compromising privacy. By keeping your memories local and using cryptographic techniques to protect even the mathematical representations of your data, we've built a system that gives you the benefits of persistent AI memory while maintaining the privacy principles Venice was founded on. + +Have questions or feedback about Memoria? Join the conversation in our community channels.
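
---

## Appendix: Document Chunking, Sketched

The chunking step from the document-upload section is simple enough to sketch. `chunkText` is a hypothetical name, and the sizes mirror the figures quoted above (~1200-character segments with a 200-character overlap); Venice's actual splitter may differ, for example by respecting sentence boundaries.

```typescript
// Split a document into overlapping segments, as described above:
// ~1200 characters per chunk, with a 200-character overlap so that
// context spanning a boundary still appears intact in some chunk.
// chunkText is an illustrative name, not Venice's actual function.
function chunkText(text: string, size = 1200, overlap = 200): string[] {
  if (overlap >= size) throw new Error("overlap must be smaller than chunk size");
  const chunks: string[] = [];
  const step = size - overlap; // 1000 new characters per chunk
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // final chunk reached end of text
  }
  return chunks;
}
```

Each chunk after the first begins with the last 200 characters of its predecessor, which is what keeps a fact that straddles a boundary retrievable.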
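
---

## Appendix: Hybrid Search Fusion, Sketched

The Reciprocal Rank Fusion step from the technical deep dive can be sketched in a few lines. The constant `k = 60` is the value commonly used in the RRF literature, and the fixed weights stand in for Memoria's adaptive weighting, which is not reproduced here; treat this as an illustration, not Venice's actual implementation.

```typescript
// Fuse two ranked lists of memory ids (dense vector results and
// sparse keyword results) with Reciprocal Rank Fusion: each list
// contributes weight / (k + rank) per item, and items are re-sorted
// by their combined score. rrfFuse is an illustrative name.
function rrfFuse(
  denseRanked: string[],  // memory ids from vector search, best first
  sparseRanked: string[], // memory ids from keyword search, best first
  denseWeight = 1,
  sparseWeight = 1,
  k = 60,
): string[] {
  const scores = new Map<string, number>();
  const add = (ids: string[], weight: number) => {
    ids.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank + 1));
    });
  };
  add(denseRanked, denseWeight);
  add(sparseRanked, sparseWeight);
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map((entry) => entry[0]);
}
```

An item ranked well in both lists beats an item ranked first in only one, which is what makes the fusion robust to either search mode misfiring. The adaptive weighting described above would correspond to adjusting `denseWeight` and `sparseWeight` per query.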
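
---

## Appendix: Embedding Compression, Sketched

The storage figures quoted above (int8 quantization for a ~75% reduction, base64 encoding for ~67% total savings, ~1.4KB per memory) can be reproduced with a short sketch. The symmetric per-vector scaling shown here is one common int8 scheme and an assumption on our part, not Venice's published implementation.

```typescript
// Quantize a float32 embedding to int8 with a per-vector scale,
// and recover an approximation on the way back. The scheme and the
// names quantize/dequantize are illustrative assumptions.
interface QuantizedEmbedding {
  data: Int8Array; // 1 byte per dimension, vs. 4 for float32
  scale: number;   // per-vector scale used to recover approximate floats
}

function quantize(v: Float32Array): QuantizedEmbedding {
  // Symmetric quantization: map [-max|v|, +max|v|] onto [-127, 127].
  let maxAbs = 0;
  for (let i = 0; i < v.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(v[i]));
  }
  const scale = maxAbs > 0 ? maxAbs / 127 : 1;
  const data = new Int8Array(v.length);
  for (let i = 0; i < v.length; i++) {
    data[i] = Math.round(v[i] / scale);
  }
  return { data, scale };
}

function dequantize(q: QuantizedEmbedding): Float32Array {
  const out = new Float32Array(q.data.length);
  for (let i = 0; i < q.data.length; i++) {
    out[i] = q.data[i] * q.scale;
  }
  return out;
}

// Storage arithmetic for one 1024-dimensional embedding:
const rawBytes = 1024 * 4;                        // float32: 4096 bytes
const int8Bytes = 1024;                           // 75% reduction
const base64Chars = Math.ceil(int8Bytes / 3) * 4; // 1368 chars, i.e. ~1.4KB
```

1368 base64 characters against 4096 raw bytes is the ~67% total saving the deep dive cites, at the cost of a small, bounded rounding error per dimension.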