---
title: "Memoria: A Technical Overview of Venice's Memory System"
description: "A deep dive into how Venice remembers your conversations while keeping your data private"
date: "2026-01-23"
author: "Venice Team"
tags:
- privacy
- technology
- memoria
- ai
- features
image: null
featured: true
---

# Memoria: A Technical Overview of Venice's Memory System

*A deep dive into how Venice remembers your conversations while keeping your data private*

---

## Introduction

Venice is proud to introduce **Memoria**, our privacy-preserving memory system that enables AI to remember context from your past conversations. Unlike traditional cloud-based memory systems, Memoria stores all data locally in your browser, ensuring that your memories never leave your device.

This technical overview explains how Memoria works, what data it stores, and answers common questions about the feature.

---

## How Memoria Works

### The Core Concept

Memoria uses **vector embeddings** to understand and retrieve relevant memories. When you chat with Venice:

1. **During conversation**: Your messages are converted into mathematical representations (vectors) that capture their meaning
2. **Automatic extraction**: Every few messages, Venice extracts key information and insights from the conversation
3. **Intelligent retrieval**: When you start a new conversation, Memoria searches for relevant past memories and provides them as context to the AI

This creates a more personalized experience where the AI can reference things you've discussed before, remember your preferences, and build on previous conversations.
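
In code, the retrieval step amounts to a nearest-neighbor search over stored embeddings. Memoria uses a FAISS index for this; the brute-force sketch below, with hypothetical `Memory` and `topKMemories` names, illustrates the idea:

```typescript
// Illustrative sketch of memory retrieval: rank stored memories by
// cosine similarity to the new message's embedding. Real embeddings
// are 1024-dimensional; tiny vectors are used here for clarity.
interface Memory {
  text: string;
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topKMemories(query: number[], memories: Memory[], k: number): Memory[] {
  // Sort a copy by descending similarity and keep the top k
  return [...memories]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}
```

A FAISS index replaces the linear scan with an approximate search structure, but the contract is the same: embed the query, return the closest stored memories.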

### Architecture Overview

```
┌─────────────────────────────────────────────────────────┐
│                      Your Browser                       │
├─────────────────────────────────────────────────────────┤
│  IndexedDB (Local Storage)                              │
│   ├── Memory records (text summaries)                   │
│   ├── Vector embeddings (1024-dimensional, compressed)  │
│   ├── Sparse search tokens                              │
│   └── Metadata (timestamps, sources, importance scores) │
├─────────────────────────────────────────────────────────┤
│  FAISS WASM Vector Index                                │
│   └── In-memory index for fast similarity search        │
└─────────────────────────────────────────────────────────┘
                            │ Search queries
┌─────────────────────────────────────────────────────────┐
│                Venice Servers (Stateless)               │
├─────────────────────────────────────────────────────────┤
│  • Generate embeddings for search queries               │
│  • Extract memory summaries from conversations          │
│  • No memory storage - all processing is transient      │
└─────────────────────────────────────────────────────────┘
```

---

## Data Storage & Privacy

### What Data Is Stored?

Memoria stores the following data locally in your browser:

| Data Type | Description | Example |
|-----------|-------------|---------|
| **Memory Text** | Summary of conversation insights | "User is learning Python programming" |
| **Vector Embedding** | 1024-dimensional mathematical representation | Compressed to ~1.4KB per memory |
| **Sparse Tokens** | Keywords for hybrid search | ["python", "programming", "learning"] |
| **Source** | Where the memory came from | "venice" (auto-extracted), "manual", or file identifier |
| **Memory Type** | Classification of the memory | "extracted_summary", "user" |
| **Importance Score** | 1-10 rating of relevance; lower = more important | 3 |
| **Timestamps** | Creation and access times | ISO date strings |
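
The table above could be modeled roughly as the following TypeScript shape. Field names are illustrative, not Venice's actual schema:

```typescript
// Hypothetical shape of a single Memoria record, mirroring the
// table of stored data types (names are illustrative).
interface MemoryRecord {
  text: string;                              // "User is learning Python programming"
  embedding: string;                         // base64-encoded int8 vector, ~1.4KB
  sparseTokens: string[];                    // ["python", "programming", "learning"]
  source: string;                            // "venice", "manual", or a file identifier
  memoryType: "extracted_summary" | "user";  // classification
  importance: number;                        // 1-10, lower = more important
  createdAt: string;                         // ISO date string
  lastAccessedAt: string;                    // ISO date string
}
```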

### Privacy Guarantees

1. **Local-only storage**: All memory data is stored in IndexedDB within your browser. It never leaves your device unless explicitly shared.

2. **Vector salting**: Even the mathematical representations of your memories are transformed using a user-specific cryptographic salt derived from your encryption key. This means:
- Your embeddings are unique to you
- They cannot be correlated with other users' data
- Even if intercepted, they cannot be reverse-engineered

3. **No server-side storage**: Venice servers only generate embeddings transiently. They do not store your memories or the content used to create them.

4. **Anonymized model protection**: You can disable memory sharing with third-party ("anonymized") models that may have different privacy guarantees than Venice's private models.
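
One way to picture the vector salting described in point 2: derive a deterministic sign-flip pattern from the user's key and apply it to every embedding. Venice has not published its exact transform; the sketch below uses a toy, non-cryptographic derivation purely to illustrate the key property, namely that vectors differ across users while distances within one user's store are preserved.

```typescript
// Illustrative only: a real system would use a cryptographic
// key-derivation function, not this toy hash/PRNG.
function saltSigns(userKey: string, dims: number): number[] {
  // Seed a simple deterministic PRNG from the user's key
  let seed = 0;
  for (const ch of userKey) seed = (seed * 31 + ch.charCodeAt(0)) >>> 0;
  const signs: number[] = [];
  for (let i = 0; i < dims; i++) {
    seed = (seed * 1664525 + 1013904223) >>> 0; // LCG step
    signs.push(seed & 1 ? 1 : -1);
  }
  return signs;
}

function saltEmbedding(vec: number[], userKey: string): number[] {
  // Flip each coordinate's sign per the user-specific pattern.
  // Dot products between same-key vectors are unchanged, so
  // similarity search still works within one user's store.
  const signs = saltSigns(userKey, vec.length);
  return vec.map((v, i) => v * signs[i]);
}
```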

---

## Chat Memory vs. Character Memory

Memoria provides two separate memory pools:

### Chat Memory

- Used in regular conversations (non-character chats)
- Enabled via Settings → Memory → Chat Memory toggle
- Memories are scoped to a special "chat_memory" identifier
- Ideal for general knowledge about you, your preferences, and ongoing projects

### Character Memory

- Used when chatting with AI characters
- Each character has their own isolated memory pool
- Memories are scoped by character ID
- Enables characters to "remember" your relationship and past conversations

**Important**: Chat Memory and Character Memory are completely separate. Memories from regular chats won't appear in character conversations, and vice versa.

---

## Document Uploads

You can enhance memory by uploading documents (PDF or text files):

### How Document Processing Works

1. **Text extraction**: Documents are parsed to extract readable text
2. **Chunking**: Large documents are split into overlapping segments (~1200 characters with 200-character overlap)
3. **Embedding**: Each chunk is converted to a vector embedding
4. **Storage**: Chunks are stored with a unique source identifier derived from the filename
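
The chunking step can be sketched directly from the numbers above: advance through the text in steps of (size - overlap) so consecutive chunks share a 200-character window. The function name is illustrative:

```typescript
// Split text into ~1200-character chunks with a 200-character
// overlap, per the parameters described above (illustrative sketch).
function chunkText(text: string, size = 1200, overlap = 200): string[] {
  const chunks: string[] = [];
  const step = size - overlap; // advance 1000 characters per chunk
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // final chunk reached the end
  }
  return chunks;
}
```

The overlap means a sentence falling on a chunk boundary still appears whole in at least one chunk, which keeps its embedding meaningful.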

### Using Document Memories

- **Enable/Disable per document**: You can toggle individual documents on/off without deleting them
- **Source filtering**: Disabled documents are excluded from memory searches
- **File limits**: Up to 50 documents for Chat Memory, 15 per character

### Best Practices for Documents

- Upload reference material you want the AI to remember
- Use clear, descriptive filenames (the filename is included in the first chunk)
- Supported formats: PDF, plain text, and other text-based files
- Password-protected PDFs are not supported

---

## Frequently Asked Questions

### Does disabling Chat Memory delete all my memories?

**No.** Disabling Chat Memory only stops the AI from accessing and creating new memories. Your existing memories remain stored in your browser and will be available again if you re-enable the feature.

To actually delete memories, you must explicitly delete them through:
- Settings → Memory → Delete individual memories
- Settings → Memory → Delete all (for interaction memories)
- Settings → Memory → Delete all documents (for uploaded files)

### Does clearing browser data / logging out delete my memories?

**Yes, potentially.** Since memories are stored in IndexedDB (browser local storage):

- **Clearing site data for venice.ai** will delete all memories
- **Clearing all browser data** will delete all memories
- **Logging out** does not delete memories (they remain in IndexedDB)
- **Using a different browser or device** means you won't have access to memories stored elsewhere

**Recommendation**: Consider your memories as browser-specific. If you use multiple devices, each will have its own independent memory store.

### Does restoring from a backup restore my memories?

**No, not currently.** The backup/restore system backs up your conversations, characters, and settings, but **memories are not included** in backups.

This is a known limitation. Memories are stored in a separate database structure optimized for vector search, and backup integration is planned for a future release.

### How do I best use this feature?

**For Chat Memory:**

1. Enable Chat Memory in Settings → Memory
2. Optionally disable "Auto-generate memories" if you prefer manual control
3. Add important facts manually using the "Add Memory" button
4. Upload reference documents you want the AI to remember
5. Periodically review and clean up irrelevant memories

**For Character Memory:**

1. Enable Character Memory in the character's settings
2. Use the "Extraction prompt" field to customize what the AI remembers about conversations
3. Upload character-specific documents (lore, backstory, reference material)
4. Review memories in the character's Memory tab

**Pro tips:**

- Memories work best when they're concise and factual
- The AI retrieves the most relevant memories based on your current message
- You can edit memories to correct or refine them
- Toggle off documents temporarily rather than deleting if you might need them later

### What data is shared with anonymized (third-party) models?

When using models not hosted directly by Venice (marked as "Anonymized"):

- **By default**: Memories are NOT shared with these models
- **If enabled**: The "Share memories with anonymized models" toggle allows memory context to be sent
- **Privacy note**: Third-party providers may have different data retention policies than Venice

We recommend keeping this toggle off unless you specifically need memory context with a particular third-party model.

---

## Technical Deep Dive

### Hybrid Search Algorithm

Memoria uses a sophisticated hybrid search combining:

1. **Dense vector search**: FAISS-based similarity search using compressed int8 vectors
2. **Sparse BM25-style search**: Keyword matching for precise term recall
3. **Reciprocal Rank Fusion (RRF)**: Combines both approaches with adaptive weighting

The search dynamically adjusts its strategy based on your query:

- Short queries (≤4 words) favor keyword matching
- Longer queries favor semantic similarity
- Strong keyword matches boost the sparse component
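
The fusion step can be sketched in a few lines. In standard RRF, each result's fused score is the sum of 1/(k + rank) over every ranked list it appears in; k = 60 is the conventional constant. Memoria's adaptive weighting is not shown here:

```typescript
// Minimal Reciprocal Rank Fusion over any number of ranked ID lists.
function rrfFuse(rankings: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}
```

A result ranked highly by both the dense and sparse lists accumulates two large terms, so agreement between the two searches naturally floats to the top.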

### Memory Extraction

Every 3rd assistant response, Memoria automatically extracts insights:

1. Collects the last 5 user messages and 5 assistant messages
2. Sends to an extraction model with a specialized prompt
3. Receives a summary and importance score
4. Filters out unimportant memories (importance > 8)
5. Stores the memory with a salted embedding
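
The extraction loop above reduces to three small decisions: when to trigger, which messages to send, and which results to keep. A sketch, with illustrative types and names:

```typescript
interface Msg { role: "user" | "assistant"; content: string; }

// Trigger on every third assistant reply
function shouldExtract(assistantReplies: number): boolean {
  return assistantReplies > 0 && assistantReplies % 3 === 0;
}

// Gather the last 5 user messages and last 5 assistant messages
function extractionWindow(history: Msg[]): Msg[] {
  const users = history.filter(m => m.role === "user").slice(-5);
  const assistants = history.filter(m => m.role === "assistant").slice(-5);
  return [...users, ...assistants];
}

// Keep only sufficiently important memories (lower = more important,
// so importance > 8 is discarded)
function keepMemory(importance: number): boolean {
  return importance <= 8;
}
```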

### Compression & Efficiency

To minimize storage impact:

- Embeddings are quantized from float32 to int8 (~75% reduction)
- Embeddings are base64 encoded for storage (~67% total savings)
- Each memory uses approximately 1.4KB of storage
- The browser quota for IndexedDB is typically 100MB-1GB, supporting thousands of memories
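
The arithmetic checks out: 1024 float32 values occupy 4096 bytes; quantizing to int8 leaves 1024 bytes (the ~75% reduction), and base64 inflates that by 4/3 to roughly 1.4KB. A sketch of the pipeline, assuming a simple max-absolute-value scaling (a real implementation would also persist the scale for dequantization):

```typescript
// Quantize a float vector into int8 range by scaling to the largest
// absolute value (illustrative; Venice's exact scheme is not published).
function quantizeInt8(vec: number[]): { bytes: Int8Array; scale: number } {
  const maxAbs = Math.max(...vec.map(Math.abs), 1e-9);
  const scale = 127 / maxAbs;
  const bytes = Int8Array.from(vec, v => Math.round(v * scale));
  return { bytes, scale };
}

// Node-style base64 encoding of the raw int8 bytes for storage
function toBase64(bytes: Int8Array): string {
  return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.length).toString("base64");
}
```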

---

## Privacy Summary

| Aspect | Status |
|--------|--------|
| Memory storage location | Your browser only (IndexedDB) |
| Server-side memory storage | None |
| Cross-device sync | Not supported |
| Encryption | Salted embeddings with user-specific keys |
| Third-party model access | Off by default, user-controlled |
| Backup inclusion | Not yet supported |
| Data portability | Browser-specific |

---

## Conclusion

Memoria represents Venice's commitment to building AI capabilities without compromising privacy. By keeping your memories local and using cryptographic techniques to protect even the mathematical representations of your data, we've built a system that delivers the benefits of persistent AI memory while upholding the privacy principles Venice was founded on.

Have questions or feedback about Memoria? Join the conversation in our community channels.