diff --git a/INVESTIGATION_BY_CLAUDE.md b/INVESTIGATION_BY_CLAUDE.md
new file mode 100644
index 0000000..b0601e0
--- /dev/null
+++ b/INVESTIGATION_BY_CLAUDE.md
@@ -0,0 +1,1071 @@
+# Codebase Investigation Report
+
+**Investigation Date:** November 13, 2025
+**Investigator:** Claude (Sonnet 4.5)
+**Repository:** Memori - SQL-Native Memory Engine for AI Agents
+
+---
+
+## 1. How Does the System Define Memory?
+
+Memory in this system is not the traditional computer science definition. It refers to conversational context storage for AI agents.
+
+### Three-Tier Storage Model
+
+**Long-Term Memory** (`long_term_memory` table)
+- All conversations are stored here after processing
+- Enhanced with metadata: classification, importance scores, entity extraction
+- Contains flags for conscious context detection (user preferences, skills, project info)
+- Never expires unless explicitly deleted
+- SQLAlchemy model: `LongTermMemory` in `memori/database/models.py:109`
+
+**Short-Term Memory** (`short_term_memory` table)
+- Working memory injected into LLM context window
+- Can be temporary (7-day expiration) or permanent
+- Populated two ways:
+  1. Conscious mode: Promoted from long-term at startup
+  2. Auto mode: Dynamically searched and injected per query
+- SQLAlchemy model: `ShortTermMemory` in `memori/database/models.py:63`
+
+**Chat History** (`chat_history` table)
+- Raw conversation logs (user input + AI output)
+- Tracks tokens used, model, session metadata
+- Source material for memory extraction
+- SQLAlchemy model: `ChatHistory` in `memori/database/models.py:28`
+
+### Memory Classification System
+
+The system uses a hierarchy defined in Pydantic models (`memori/utils/pydantic_models.py:22`):
+
+**Classification Types:**
+- `ESSENTIAL` - Core facts, preferences, skills
+- `CONTEXTUAL` - Project context, ongoing work
+- `CONVERSATIONAL` - Regular chat discussions
+- `REFERENCE` - Code examples, technical docs
+- `PERSONAL` - User details, relationships
+- `CONSCIOUS_INFO` - Auto-promoted to short-term context
+
+**Importance Levels:**
+- `CRITICAL` - Must never be lost
+- `HIGH` - Very important for context
+- `MEDIUM` - Useful to remember
+- `LOW` - Nice to have context
+
+**Category Types:**
+- `fact` - Factual information, definitions
+- `preference` - User preferences, likes/dislikes
+- `skill` - Skills, abilities, expertise
+- `context` - Project context, environment
+- `rule` - Rules, policies, procedures
+
+### How Memory is Extracted
+
+An OpenAI-powered agent (`MemoryAgent` in `memori/agents/memory_agent.py`) processes each conversation using structured outputs:
+
+1. Takes raw user input + AI output
+2. Extracts entities (people, technologies, topics, skills, projects)
+3. Assigns classification, importance, and scores
+4. Detects duplicates and relationships
+5. Identifies conscious context (user info that should be immediately available)
+6. Returns a `ProcessedLongTermMemory` Pydantic model
+
+The extraction uses a 73+ line system prompt with detailed classification rules.
+
+---
+
+## 2. Where is This Supposed to Be Used?
+
+### Primary Use Case: AI Agent Memory
+
+This system gives stateless LLMs (OpenAI, Anthropic, etc.) the ability to remember past conversations. Without this, every chat is blank slate.
+
+### Target Deployments
+
+**Individual Developers:**
+- SQLite local file for personal AI assistants
+- No infrastructure needed
+- Example: `database_connect="sqlite:///memori.db"`
+
+**Production AI Applications:**
+- PostgreSQL/MySQL for multi-user systems
+- Multi-tenant isolation via `user_id`, `assistant_id`, `session_id`
+- Deployed on: Neon, Supabase, AWS RDS, Azure Database
+
+**Multi-Agent Systems:**
+- CrewAI, AutoGen, LangChain integrations (15+ examples in repo)
+- Shared memory across agent swarms
+- Agent-specific memory via `assistant_id` parameter
+
+### Integration Points
+
+**Interception Mode (Zero-Code):**
+```python
+memori = Memori(database_connect="...", conscious_ingest=True)
+memori.enable()  # Hooks into OpenAI/Anthropic/LiteLLM calls
+
+client = OpenAI()  # Standard client usage
+# Memori intercepts all calls automatically
+```
+
+**Callback Mode (LiteLLM Native):**
+- Registers as LiteLLM success callback
+- Works with 100+ LLM providers via LiteLLM
+- Automatically records conversations
+
+**Wrapper Mode:**
+```python
+from memori import MemoriOpenAI
+client = MemoriOpenAI(api_key="...", memori=memori_instance)
+```
+
+### What It's NOT For
+
+- Not a vector database replacement (though 80-90% cheaper)
+- Not for real-time streaming context (works on completed exchanges)
+- Not for file/document storage (conversation-focused)
+- Not a RAG system (no document chunking/embedding built-in)
+
+---
+
+## 3. What is the Architecture of the System?
+
+### High-Level Design
+
+```
+┌─────────────────────────────────────────────────────────┐
+│                    User Application                      │
+│  (OpenAI client, Anthropic client, LiteLLM, etc.)       │
+└────────────────────┬────────────────────────────────────┘
+                     │
+                     ↓
+┌─────────────────────────────────────────────────────────┐
+│                  Memori Core Layer                       │
+│  ┌──────────────────────────────────────────────────┐   │
+│  │  Interception Layer (memory.py)                   │   │
+│  │  - OpenAI method patching                         │   │
+│  │  - Anthropic wrapper                              │   │
+│  │  - LiteLLM callbacks                              │   │
+│  └──────────────────────────────────────────────────┘   │
+│                     │                                    │
+│                     ↓                                    │
+│  ┌──────────────────────────────────────────────────┐   │
+│  │  Context Injection Logic                          │   │
+│  │  - Conscious mode: One-shot at startup            │   │
+│  │  - Auto mode: Per-query search                    │   │
+│  └──────────────────────────────────────────────────┘   │
+│                     │                                    │
+│                     ↓                                    │
+│  ┌──────────────────────────────────────────────────┐   │
+│  │  Agent System                                     │   │
+│  │  - MemoryAgent: Extracts structured data          │   │
+│  │  - MemorySearchEngine: Plans searches             │   │
+│  │  - ConsciouscAgent: Promotes memories             │   │
+│  └──────────────────────────────────────────────────┘   │
+└────────────────────┬────────────────────────────────────┘
+                     │
+                     ↓
+┌─────────────────────────────────────────────────────────┐
+│              Database Abstraction Layer                  │
+│  ┌──────────────────────────────────────────────────┐   │
+│  │  SQLAlchemyDatabaseManager                        │   │
+│  │  - ORM models (ChatHistory, ShortTermMemory, etc) │   │
+│  │  - Cross-database compatibility                   │   │
+│  │  - Connection pooling                             │   │
+│  └──────────────────────────────────────────────────┘   │
+│  ┌──────────────────────────────────────────────────┐   │
+│  │  SearchService                                    │   │
+│  │  - FTS5 (SQLite)                                  │   │
+│  │  - PostgreSQL FTS                                 │   │
+│  │  - MySQL FULLTEXT                                 │   │
+│  │  - MongoDB text search                            │   │
+│  └──────────────────────────────────────────────────┘   │
+└────────────────────┬────────────────────────────────────┘
+                     │
+                     ↓
+┌─────────────────────────────────────────────────────────┐
+│         Database Layer (User Controlled)                 │
+│    SQLite │ PostgreSQL │ MySQL │ MongoDB                │
+└─────────────────────────────────────────────────────────┘
+```
+
+### Key Architectural Patterns
+
+**1. Interception Pattern**
+- Patches OpenAI SDK methods (`openai.OpenAI.chat.completions.create`)
+- Uses context variables (`ContextVar`) for thread-safe multi-tenant isolation
+- Implemented in `memori/integrations/openai_integration.py`
+
+**2. Observer Pattern (LiteLLM Callbacks)**
+- Registers as LiteLLM success callback
+- Observes all LLM completions across 100+ providers
+- Callback location: `memori/core/memory.py:_setup_litellm_callbacks`
+
+**3. Strategy Pattern (Memory Injection)**
+- Two strategies: conscious_ingest (one-shot) vs auto_ingest (dynamic)
+- Switchable at initialization
+- Logic in `memori/core/memory.py:_inject_*_context` methods
+
+**4. Database Abstraction (SQLAlchemy ORM)**
+- Single model definition works across SQLite, PostgreSQL, MySQL
+- Database-specific features added via dialect detection
+- Models: `memori/database/models.py`
+
+**5. Agent-Based Processing**
+- Uses OpenAI structured outputs (Pydantic models)
+- Three specialized agents for different tasks
+- Agent implementations in `memori/agents/`
+
+### Component Breakdown
+
+**Core Module (`memori/core/`):**
+- `memory.py` (3,052 lines) - Main `Memori` class, orchestration logic
+- `conversation.py` - Conversation session management
+- `providers.py` - Provider configuration (OpenAI, Azure, custom endpoints)
+
+**Database Module (`memori/database/`):**
+- `sqlalchemy_manager.py` - SQL database management
+- `mongodb_manager.py` - MongoDB-specific manager
+- `models.py` - SQLAlchemy ORM definitions
+- `search_service.py` (1,470 lines) - Cross-database search
+- `adapters/` - Database-specific implementations
+- `connectors/` - Database drivers
+- `migrations/` - Schema migrations
+
+**Agents Module (`memori/agents/`):**
+- `memory_agent.py` - Conversation processing with OpenAI structured outputs
+- `conscious_agent.py` - Memory promotion to short-term
+- `retrieval_agent.py` - Search query planning
+
+**Config Module (`memori/config/`):**
+- `settings.py` - Pydantic settings models
+- `manager.py` - Singleton config manager
+- `memory_manager.py` - Recording orchestration
+
+**Integrations Module (`memori/integrations/`):**
+- `openai_integration.py` - OpenAI SDK interception
+- `anthropic_integration.py` - Anthropic wrapper
+- `litellm_integration.py` - LiteLLM callbacks
+
+---
+
+## 4. How Does the System Manage State and Data?
+
+### State Management
+
+**Thread-Safe Initialization:**
+```python
+# In Memori class (memory.py:162)
+self._conscious_init_lock = threading.RLock()  # Recursive lock
+self._conscious_init_pending = False  # Deferred initialization flag
+```
+
+Conscious agent runs once on first LLM call, not at `__init__`. This prevents blocking during object creation.
+
+**Context Variable for Multi-Tenancy:**
+```python
+# In openai_integration.py
+_active_memori_context: ContextVar[MemoriContext | None]
+```
+
+This ensures each thread/async task gets the correct Memori instance. Prevents context leakage in concurrent web servers.
+
+**Session Management:**
+```python
+# ConversationManager in core/conversation.py
+class ConversationSession:
+    session_id: str
+    messages: list[ConversationMessage]
+    context_injected: bool  # Tracks if context already added
+```
+
+Prevents duplicate context injection in same conversation.
+
+### Data Flow
+
+**Recording Flow (After LLM Response):**
+
+1. LLM completes response
+2. Interception/callback captures: user input, AI output, model, tokens
+3. `MemoryAgent` processes conversation → `ProcessedLongTermMemory` Pydantic model
+4. Data written to `long_term_memory` table with all metadata
+5. If `conscious_ingest=True` and memory is `CONSCIOUS_INFO`, also copied to `short_term_memory`
+6. Chat history written to `chat_history` table
+
+**Retrieval Flow (Before LLM Call):**
+
+**Conscious Mode (one-shot at startup):**
+1. `ConsciouscAgent.run_conscious_ingest()` queries long-term memory
+2. Finds memories flagged with `promotion_eligible=True` or classification=`CONSCIOUS_INFO`
+3. Copies to `short_term_memory` as permanent context
+4. On first LLM call, all short-term memories injected into system prompt
+5. No repeated injection in subsequent calls (cached in session state)
+
+**Auto Mode (dynamic per query):**
+1. User sends query
+2. `MemorySearchEngine` analyzes query → generates `MemorySearchQuery` plan
+3. Search executed against long-term memory (FTS + filters)
+4. Top 3-10 relevant memories retrieved
+5. Injected into system prompt for this call only
+6. Process repeats for next query
+
+### Data Persistence
+
+**SQLAlchemy Session Management:**
+```python
+# In sqlalchemy_manager.py
+SessionLocal = sessionmaker(bind=engine)
+session = SessionLocal()
+# Operations
+session.commit()
+session.close()
+```
+
+Standard SQLAlchemy session lifecycle. Each operation opens/closes session.
+
+**Connection Pooling:**
+```python
+engine = create_engine(
+    database_url,
+    pool_size=2,           # Configurable
+    max_overflow=3,        # Extra connections if needed
+    pool_timeout=30,       # Wait time for connection
+    pool_recycle=3600,     # Recycle connections hourly
+    pool_pre_ping=True     # Test before using
+)
+```
+
+Configured in `Memori.__init__` parameters.
+
+**Multi-Tenant Isolation:**
+
+Every query filtered by:
+```python
+user_id = "default"        # Primary isolation
+assistant_id = None        # Optional bot-specific
+session_id = "default"     # Conversation grouping
+```
+
+Queries always include: `WHERE user_id = ? [AND assistant_id = ?]`
+
+No cross-tenant data leakage possible at query level.
+
+### State Transitions
+
+**Memory Lifecycle:**
+
+```
+1. User/AI Exchange
+   ↓
+2. ChatHistory (raw storage)
+   ↓
+3. MemoryAgent Processing
+   ↓
+4. LongTermMemory (with metadata)
+   ↓
+5a. [If CONSCIOUS_INFO] → ShortTermMemory (permanent)
+   OR
+5b. [If high importance] → Promotion eligible flag set
+   OR
+5c. [If auto_ingest] → Retrieved on matching queries
+   ↓
+6. [Optional] Expires from ShortTermMemory after ~7 days (if not permanent)
+```
+
+**Processing Status Flags:**
+```python
+# In LongTermMemory model
+processed_for_duplicates = Column(Boolean, default=False)
+conscious_processed = Column(Boolean, default=False)
+```
+
+These prevent re-processing same memories.
+
+### Concurrency Control
+
+**Optimistic Locking (Planned, Not Implemented):**
+```python
+# In LongTermMemory model (models.py:162)
+version = Column(Integer, nullable=False, default=1)
+# TODO: Implement optimistic locking logic using this column
+# Currently unused - planned for future enhancement
+```
+
+Currently no conflict resolution. Concurrent updates would cause race conditions. This is a known gap.
+
+---
+
+## 5. What is the Data Model?
+
+### Database Schema
+
+**Three Core Tables:**
+
+**`chat_history`:**
+```sql
+chat_id VARCHAR(255) PRIMARY KEY
+user_input TEXT NOT NULL
+ai_output TEXT NOT NULL
+model VARCHAR(255) NOT NULL
+session_id VARCHAR(255) NOT NULL
+tokens_used INTEGER DEFAULT 0
+metadata_json JSON
+user_id VARCHAR(255) NOT NULL DEFAULT 'default'
+assistant_id VARCHAR(255) NULL
+created_at DATETIME NOT NULL
+updated_at DATETIME
+
+Indexes:
+- idx_chat_user_id (user_id)
+- idx_chat_user_assistant (user_id, assistant_id)
+- idx_chat_created (created_at)
+- idx_chat_model (model)
+```
+
+**`short_term_memory`:**
+```sql
+memory_id VARCHAR(255) PRIMARY KEY
+chat_id VARCHAR(255) FOREIGN KEY → chat_history.chat_id
+processed_data JSON NOT NULL
+importance_score FLOAT NOT NULL DEFAULT 0.5
+category_primary VARCHAR(255) NOT NULL
+retention_type VARCHAR(50) NOT NULL DEFAULT 'short_term'
+user_id VARCHAR(255) NOT NULL DEFAULT 'default'
+assistant_id VARCHAR(255) NULL
+session_id VARCHAR(255) NOT NULL DEFAULT 'default'
+created_at DATETIME NOT NULL
+expires_at DATETIME NULL  -- NULL = permanent
+searchable_content TEXT NOT NULL
+summary TEXT NOT NULL
+is_permanent_context BOOLEAN DEFAULT FALSE
+
+Indexes: 8 total (user_id, category, importance, expires_at, etc.)
+```
+
+**`long_term_memory`:**
+```sql
+memory_id VARCHAR(255) PRIMARY KEY
+processed_data JSON NOT NULL  -- Full ProcessedLongTermMemory as JSON
+importance_score FLOAT NOT NULL DEFAULT 0.5
+category_primary VARCHAR(255) NOT NULL
+retention_type VARCHAR(50) NOT NULL DEFAULT 'long_term'
+user_id VARCHAR(255) NOT NULL DEFAULT 'default'
+assistant_id VARCHAR(255) NULL
+session_id VARCHAR(255) NOT NULL DEFAULT 'default'
+created_at DATETIME NOT NULL
+searchable_content TEXT NOT NULL
+summary TEXT NOT NULL
+
+-- Scoring fields
+novelty_score FLOAT DEFAULT 0.5
+relevance_score FLOAT DEFAULT 0.5
+actionability_score FLOAT DEFAULT 0.5
+
+-- Classification fields
+classification VARCHAR(50) NOT NULL DEFAULT 'conversational'
+memory_importance VARCHAR(20) NOT NULL DEFAULT 'medium'
+topic VARCHAR(255) NULL
+entities_json JSON NULL
+keywords_json JSON NULL
+
+-- Conscious context flags
+is_user_context BOOLEAN DEFAULT FALSE
+is_preference BOOLEAN DEFAULT FALSE
+is_skill_knowledge BOOLEAN DEFAULT FALSE
+is_current_project BOOLEAN DEFAULT FALSE
+promotion_eligible BOOLEAN DEFAULT FALSE
+
+-- Deduplication fields
+duplicate_of VARCHAR(255) NULL
+supersedes_json JSON NULL
+related_memories_json JSON NULL
+
+-- Metadata
+confidence_score FLOAT DEFAULT 0.8
+classification_reason TEXT NULL
+processed_for_duplicates BOOLEAN DEFAULT FALSE
+conscious_processed BOOLEAN DEFAULT FALSE
+version INTEGER NOT NULL DEFAULT 1  -- For optimistic locking (unused)
+
+Indexes: 11 total (20+ composite indexes for query optimization)
+```
+
+### Full-Text Search Tables
+
+**SQLite (FTS5):**
+```sql
+CREATE VIRTUAL TABLE memory_search_fts USING fts5(
+    memory_id,
+    memory_type,
+    user_id,
+    searchable_content,
+    summary,
+    category_primary,
+    content='',
+    contentless_delete=1
+);
+
+-- Triggers maintain FTS index on inserts/deletes
+```
+
+**PostgreSQL:**
+```sql
+ALTER TABLE short_term_memory ADD COLUMN search_vector tsvector;
+ALTER TABLE long_term_memory ADD COLUMN search_vector tsvector;
+
+CREATE INDEX idx_short_term_search_vector ON short_term_memory USING GIN(search_vector);
+
+-- Triggers auto-update tsvector on insert/update
+```
+
+**MySQL:**
+```sql
+ALTER TABLE short_term_memory ADD FULLTEXT INDEX ft_short_term_search (searchable_content, summary);
+ALTER TABLE long_term_memory ADD FULLTEXT INDEX ft_long_term_search (searchable_content, summary);
+```
+
+### Pydantic Models (Application Layer)
+
+**`ProcessedLongTermMemory` (primary data structure):**
+```python
+class ProcessedLongTermMemory(BaseModel):
+    content: str                                  # Actual memory
+    summary: str                                  # Search-optimized summary
+    classification: MemoryClassification          # Enum
+    importance: MemoryImportanceLevel             # Enum
+    topic: str | None
+    entities: list[str]
+    keywords: list[str]
+
+    # Conscious flags
+    is_user_context: bool
+    is_preference: bool
+    is_skill_knowledge: bool
+    is_current_project: bool
+
+    # Deduplication
+    duplicate_of: str | None
+    supersedes: list[str]
+    related_memories: list[str]
+
+    # Metadata
+    session_id: str
+    confidence_score: float = 0.8
+    extraction_timestamp: datetime
+    classification_reason: str
+    promotion_eligible: bool
+```
+
+This Pydantic model is serialized to `processed_data` JSON column in database.
+
+### Entity Extraction Model
+
+```python
+class ExtractedEntities(BaseModel):
+    people: list[str]              # Names mentioned
+    technologies: list[str]        # Tools, libraries, frameworks
+    topics: list[str]              # Main subjects
+    skills: list[str]              # Abilities, competencies
+    projects: list[str]            # Repos, initiatives
+    keywords: list[str]            # Search keywords
+    structured_entities: list[ExtractedEntity]  # With metadata
+```
+
+Stored in `entities_json` column as JSON array.
+
+### Relationships
+
+**Database Foreign Keys:**
+- `short_term_memory.chat_id` → `chat_history.chat_id` (SET NULL on delete)
+
+**Logical Relationships (no enforced FKs):**
+- `duplicate_of` → `memory_id` of original memory
+- `supersedes_json` → list of `memory_id` values this replaces
+- `related_memories_json` → list of related `memory_id` values
+
+These are managed in application code, not database constraints.
+
+### Indexing Strategy
+
+**20+ indexes optimized for:**
+1. Multi-tenant queries: `(user_id, assistant_id)`
+2. Category filtering: `(user_id, category_primary, importance_score)`
+3. Temporal queries: `(created_at)`, `(expires_at)`
+4. Conscious context: `(is_user_context, is_preference, is_skill_knowledge, promotion_eligible)`
+5. Full-text search: FTS5/tsvector/FULLTEXT per database
+6. Optimistic locking: `(memory_id, version)` (unused currently)
+
+The indexing is aggressive - prioritizes read performance over write speed. Appropriate for memory system where reads (context retrieval) far exceed writes.
+
+---
+
+## 6. Stack Analysis (Systems Thinking + Formal Design)
+
+### Technology Stack
+
+**Language & Runtime:**
+- Python 3.10+ (required)
+- Async support (asyncio for agent operations)
+
+**Core Dependencies:**
+- **Pydantic 2.0+** - Data validation, structured LLM outputs
+- **SQLAlchemy 2.0+** - ORM, cross-database compatibility
+- **OpenAI SDK 1.0+** - LLM API client, structured outputs
+- **LiteLLM 1.0+** - Universal LLM provider abstraction
+- **Loguru** - Structured logging
+
+**Database Drivers:**
+- sqlite3 (built-in)
+- psycopg2 (PostgreSQL)
+- PyMySQL (MySQL)
+- pymongo (MongoDB)
+
+**Optional Integrations:**
+- anthropic (Anthropic Claude)
+- python-dotenv (environment config)
+
+### System Design Analysis
+
+**Design Philosophy:**
+This is an **interception layer** that sits between user code and LLM APIs. It's not a standalone service - it's embedded in the application process.
+
+**Coupling Analysis:**
+
+**Tight Coupling (Concerning):**
+1. **OpenAI SDK Dependency:** Agents require OpenAI for memory processing. If you want to use only Anthropic, you still need OpenAI API key for the `MemoryAgent`. This is a design constraint.
+
+2. **Pydantic Version Lock:** Heavy use of Pydantic 2.0+ features (structured outputs, protected namespaces). Upgrading Pydantic 3.0 could break things.
+
+3. **LiteLLM Callback System:** Auto-ingestion relies on LiteLLM's callback hooks. If LiteLLM changes callback API, this breaks.
+
+4. **Database Schema Rigidity:** Adding new fields requires migrations. The schema is tightly coupled to Pydantic models.
+
+**Loose Coupling (Good):**
+1. **Database Abstraction:** SQLAlchemy ORM allows swapping databases without code changes. Clean abstraction.
+
+2. **Provider Configuration:** `ProviderConfig` abstraction allows OpenAI/Azure/Custom endpoints via single interface.
+
+3. **Search Service Abstraction:** `SearchService` class provides unified search API across FTS5/PostgreSQL/MySQL/MongoDB implementations.
+
+**Cohesion Analysis:**
+
+**High Cohesion (Good):**
+- Each agent has single responsibility (extraction, search, promotion)
+- Database module cleanly separates from core logic
+- Configuration module isolated from business logic
+
+**Low Cohesion (Concerning):**
+- `memory.py` is 3,052 lines doing orchestration, interception, injection, and agent management. This violates single responsibility.
+- Mixed concerns: configuration, state management, and business logic in one class.
+
+### Formal System Properties
+
+**Consistency:**
+- **Issue:** No distributed transaction support. If `chat_history` write succeeds but `long_term_memory` write fails, you have orphaned data.
+- **Issue:** Optimistic locking planned but not implemented. Concurrent updates to same memory will cause lost updates.
+
+**Availability:**
+- **Good:** No external dependencies beyond database. If database is up, system works.
+- **Issue:** Synchronous processing blocks LLM responses. Memory extraction adds latency.
+
+**Partition Tolerance:**
+- **N/A:** Single-process architecture. No distributed system concerns.
+
+**CAP Theorem Assessment:**
+- This is a **CP system** (Consistency + Partition Tolerance), but partitioning isn't relevant.
+- In reality: **CA system** - prioritizes consistency and availability in single-node deployment.
+
+**ACID Properties:**
+
+**Atomicity:**
+- **Partial:** Each database write is atomic, but multi-step processes (chat → long-term → short-term) are not transactional.
+
+**Consistency:**
+- **Good:** Foreign key constraints enforced where present.
+- **Issue:** Logical relationships (`duplicate_of`, `supersedes`) not enforced. Can point to non-existent memory IDs.
+
+**Isolation:**
+- **Good:** SQLAlchemy session isolation per operation.
+- **Issue:** No row-level locking. Concurrent updates cause race conditions.
+
+**Durability:**
+- **Good:** Delegated to underlying database. PostgreSQL/MySQL provide WAL. SQLite provides journaling.
+
+### Scalability Analysis
+
+**Vertical Scaling:**
+- **Good:** Connection pooling supports concurrent requests.
+- **Limit:** Single database bottleneck. All reads/writes hit one database.
+
+**Horizontal Scaling:**
+- **Not Supported:** No sharding, no read replicas, no caching layer.
+- **Multi-Tenancy:** Achieved via `user_id` column, but all tenants on same database.
+
+**Performance Characteristics:**
+
+**Write Path:**
+```
+User query → LLM (300-2000ms)
+           → Memory extraction via MemoryAgent (500-1500ms)
+           → Database write (10-50ms)
+Total added latency: 510-1550ms per conversation
+```
+
+This is **synchronous** and blocks the response. Users wait for memory processing.
+
+**Read Path (Conscious Mode):**
+```
+Startup → ConsciouscAgent queries long-term (50-200ms)
+        → Copies to short-term (10-100ms)
+First LLM call → Injects all short-term memories (0ms, already in memory)
+Total startup cost: 60-300ms (one-time)
+```
+
+**Read Path (Auto Mode):**
+```
+Each LLM call → MemorySearchEngine plans query (300-800ms)
+              → Search service FTS query (20-100ms)
+              → Inject top results (0ms)
+Total added latency: 320-900ms per query
+```
+
+**Bottlenecks:**
+1. **Agent LLM calls:** Every memory extraction requires OpenAI API call (network latency + inference time)
+2. **Full-text search:** Large memories (10k+ entries) will slow FTS queries
+3. **No caching:** Repeated queries re-search database every time
+
+---
+
+## 7. Problems I See
+
+### Critical Issues
+
+**1. Blocking Synchronous Processing**
+- Memory extraction happens in response path
+- User waits 500-1500ms for memory to be processed after LLM responds
+- **Impact:** Poor UX, high latency
+- **Location:** `memory.py:_record_*_conversation` methods
+- **Fix:** Background task queue (Celery, Redis Queue, or simple threading)
+
+**2. OpenAI Dependency for All Memory Processing**
+- Even if you use Anthropic/Ollama for conversations, memory extraction requires OpenAI API key
+- **Impact:** Vendor lock-in, additional cost, single point of failure
+- **Location:** `MemoryAgent.__init__` in `agents/memory_agent.py`
+- **Fix:** Support LiteLLM for agent operations, allow any provider for memory processing
+
+**3. No Optimistic Locking Implementation**
+- Version column exists but unused
+- Concurrent updates will cause lost updates
+- **Impact:** Data corruption in multi-user environments
+- **Location:** `models.py:162` - version column defined but no logic
+- **Fix:** Implement version checking in update operations
+
+**4. No Transaction Management Across Tables**
+- Writing chat → long-term → short-term is multi-step without transactions
+- Partial failures leave orphaned records
+- **Impact:** Data inconsistency
+- **Location:** Throughout `sqlalchemy_manager.py`
+- **Fix:** Wrap multi-table operations in SQLAlchemy transactions
+
+**5. Search Recursion Issue (Known in ROADMAP)**
+- Recursive memory lookups in remote DB environments
+- **Impact:** Infinite loops, performance degradation
+- **Status:** Listed as CRITICAL in `ROADMAP.md:37`
+- **Fix:** Not implemented yet
+
+### Major Design Issues
+
+**6. 3,052-Line God Class**
+- `Memori` class in `memory.py` does everything
+- Violates single responsibility principle
+- **Impact:** Hard to test, maintain, extend
+- **Fix:** Break into smaller classes (Recorder, Injector, Orchestrator)
+
+**7. Duplicate Memory Creation (Known Issue)**
+- Memories appear in both short-term and long-term incorrectly
+- **Impact:** Wasted storage, context pollution
+- **Location:** `ROADMAP.md:36` - Known Issue
+- **Fix:** Not implemented yet
+
+**8. No Caching Layer**
+- Every context injection re-queries database
+- Same user with same context queries repeatedly
+- **Impact:** Unnecessary database load, latency
+- **Fix:** Redis/memcached for short-term memory cache
+
+**9. Aggressive Memory Extraction**
+- Every conversation processed, even trivial ones
+- "Hello" → full entity extraction → database write
+- **Impact:** Wasted API calls, database bloat
+- **Fix:** Filtering logic in `MemoryAgent` to skip trivial conversations
+
+**10. PostgreSQL FTS Issues on Neon (Known Issue)**
+- Partial search failure with full-text search
+- **Impact:** Broken search on popular hosting platform
+- **Location:** `ROADMAP.md:38` - Known Issue
+- **Status:** Inconsistent behavior
+
+### Moderate Issues
+
+**11. Thread Safety Concerns**
+- Context variables used for OpenAI interception, but not fully tested
+- `_conscious_init_lock` is RLock, but initialization logic still has race conditions
+- **Impact:** Potential bugs in high-concurrency environments
+
+**12. No Rate Limiting**
+- Unlimited LLM API calls for memory extraction
+- User could trigger thousands of expensive OpenAI calls
+- **Impact:** Cost explosion, API quota exhaustion
+- **Fix:** Rate limiter in `security/rate_limiter.py` exists but not integrated
+
+**13. No Monitoring/Observability**
+- No metrics exported (memory count, latency, errors)
+- Only logging via Loguru
+- **Impact:** Hard to debug production issues
+- **Fix:** Prometheus metrics, OpenTelemetry integration
+
+**14. Weak Input Validation**
+- User input goes directly to LLM without sanitization
+- SQL injection prevented by ORM, but no XSS protection
+- **Impact:** Potential injection attacks if data displayed in web UI
+- **Location:** `utils/input_validator.py` exists but not consistently used
+
+**15. No Graceful Degradation**
+- If database is down, entire system fails
+- No fallback to in-memory storage
+- **Impact:** Brittle deployment
+- **Fix:** Circuit breaker pattern, in-memory fallback mode
+
+**16. Memory Limits Not Enforced**
+- `max_short_term_memories` and `max_long_term_memories` configured but not enforced in code
+- **Impact:** Unbounded growth, database bloat
+- **Location:** `settings.py` defines limits, but no cleanup logic
+
+**17. Timezone Handling**
+- Uses `datetime.utcnow` (deprecated in Python 3.12+)
+- No timezone awareness in `created_at`, `expires_at` columns
+- **Impact:** Incorrect expiration logic across timezones
+- **Fix:** Use `datetime.now(timezone.utc)` and timezone-aware datetimes
+
+### Minor Issues
+
+**18. Deprecated `namespace` Parameter**
+- Still supported but warns users
+- **Impact:** Confusing API, tech debt
+- **Fix:** Remove in v3.0 as planned
+
+**19. Error Messages Not User-Friendly**
+- Exceptions expose internal details (stack traces, SQL)
+- **Impact:** Poor developer experience
+- **Fix:** Wrap in user-friendly error messages
+
+**20. No Async Database Operations**
+- SQLAlchemy used in sync mode only
+- Async LLM calls but sync database writes
+- **Impact:** Blocking I/O in async contexts
+- **Fix:** Use SQLAlchemy async engine
+
+---
+
+## 8. Well-Designed Parts
+
+### Excellent Design Decisions
+
+**1. SQLAlchemy ORM Abstraction**
+- Single model definition works across SQLite, PostgreSQL, MySQL
+- Database-specific features (FTS) added via dialect detection
+- **Why it's good:** Zero vendor lock-in. Users can start with SQLite, migrate to PostgreSQL with zero code changes.
+- **Location:** `database/models.py`, `database/adapters/`
+
+**2. Pydantic-Based Structured Outputs**
+- Uses OpenAI structured outputs with Pydantic models
+- Type-safe, validated data extraction
+- **Why it's good:** No regex parsing of LLM outputs. Guaranteed schema compliance.
+- **Location:** `utils/pydantic_models.py`, `agents/memory_agent.py`
+
+**3. Database-Specific Search Implementations**
+- FTS5 for SQLite, tsvector for PostgreSQL, FULLTEXT for MySQL
+- Unified `SearchService` API abstracts differences
+- **Why it's good:** Optimal performance per database, clean abstraction
+- **Location:** `database/search_service.py`, `database/search/`
+
+**4. Multi-Tenant Isolation via Columns**
+- `user_id`, `assistant_id`, `session_id` in all tables
+- Query filters automatically applied
+- **Why it's good:** Simple, effective, no complex schema-per-tenant
+- **Location:** All models in `database/models.py`
+
+**5. Conscious Ingest Mode**
+- One-shot working memory injection at startup
+- Mimics human consciousness (permanent context)
+- **Why it's good:** Efficient. Avoids repeated searches. Novel approach to "always-on" context.
+- **Location:** `agents/conscious_agent.py`, `core/memory.py`
+
+**6. LiteLLM Callback Integration**
+- Works with 100+ LLM providers automatically
+- No per-provider integration needed
+- **Why it's good:** Future-proof. New providers supported automatically.
+- **Location:** `integrations/litellm_integration.py`
+
+**7. Provider Configuration Abstraction**
+- `ProviderConfig.from_openai()`, `.from_azure()`, `.from_custom()`
+- Unified interface for different providers
+- **Why it's good:** Clean API, hides complexity of Azure vs OpenAI vs custom endpoints
+- **Location:** `core/providers.py`
+
+**8. Extensive Indexing Strategy**
+- 20+ indexes for query optimization
+- Composite indexes for multi-column filters
+- **Why it's good:** Read-optimized for memory retrieval (primary use case)
+- **Location:** `database/models.py:__table_args__`
+
+**9. Connection Pooling Configuration**
+- Exposed as `Memori.__init__` parameters
+- Configurable pool size, timeout, recycling
+- **Why it's good:** Production-ready. Users can tune for their workload.
+- **Location:** `core/memory.py:78-83`
+
+**10. Memory Classification Hierarchy**
+- Clear categories: fact, preference, skill, context, rule
+- Importance levels: critical, high, medium, low
+- **Why it's good:** Structured memory organization. Queryable by type.
+- **Location:** `utils/pydantic_models.py:12-40`
+
+**11. Deduplication Metadata**
+- `duplicate_of`, `supersedes_json`, `related_memories_json`
+- Tracks memory relationships
+- **Why it's good:** Prevents memory bloat, maintains memory graph
+- **Location:** `database/models.py:147-149`
+
+**12. Comprehensive Examples**
+- 15+ integration examples (CrewAI, LangChain, AutoGen, etc.)
+- Real-world usage patterns documented
+- **Why it's good:** Low barrier to entry. Users can copy-paste.
+- **Location:** `examples/` directory
+
+**13. Expiration Logic for Short-Term Memory**
+- `expires_at` column with NULL = permanent
+- Automatic cleanup possible
+- **Why it's good:** Memory hygiene. Prevents unbounded growth.
+- **Location:** `database/models.py:83`
+
+**14. Entity Extraction Model**
+- Categorized entities: people, technologies, topics, skills, projects
+- Structured with relevance scores
+- **Why it's good:** Rich metadata for advanced queries
+- **Location:** `utils/pydantic_models.py:80-120`
+
+**15. Configuration Management**
+- Singleton `ConfigManager` with multiple sources (env, file, defaults)
+- Source tracking for debugging
+- **Why it's good:** Flexible deployment. 12-factor app compliance.
+- **Location:** `config/manager.py`
+
+### Good Architectural Patterns
+
+**16. Lazy Agent Initialization**
+- Agents imported with try/except fallback
+- Not loaded until needed
+- **Why it's good:** Faster startup, graceful degradation if agents fail to load
+- **Location:** `__init__.py:44-55`
+
+**17. Separation of Concerns (Modules)**
+- Clear module boundaries: core, database, agents, config, integrations
+- Each module has defined responsibility
+- **Why it's good:** Maintainable, testable, follows Python package conventions
+- **Location:** Repository structure
+
+**18. Comprehensive Exception Hierarchy**
+- Custom exceptions: `MemoriError`, `DatabaseError`, `AgentError`, etc.
+- Specific error types for specific failures
+- **Why it's good:** Better error handling, debugging
+- **Location:** `utils/exceptions.py`
+
+**19. Backward Compatibility**
+- Deprecated `namespace` parameter still works with warning
+- Migration path for users
+- **Why it's good:** Doesn't break existing code, guides users to new API
+- **Location:** `core/memory.py:122-134`
+
+**20. Database Auto-Creation**
+- `schema_init=True` creates tables automatically
+- Zero manual setup
+- **Why it's good:** Developer experience. Works out of the box.
+- **Location:** `database/auto_creator.py`
+
+---
+
+## Summary Assessment
+
+### What This System Does Well
+
+1. **Solves a real problem:** LLMs are stateless. This gives them memory.
+2. **User owns the data:** No vendor lock-in. Standard SQL databases.
+3. **Cost-effective:** 80-90% cheaper than vector databases (no embeddings).
+4. **Developer experience:** One-line integration, extensive examples.
+5. **Database flexibility:** Works with SQLite to production PostgreSQL.
+6. **Intelligent classification:** Not just storing text - structured memory with metadata.
+
+### What This System Struggles With
+
+1. **Performance:** Synchronous processing blocks responses. No caching.
+2. **Scalability:** Single database, no sharding, no read replicas.
+3. **Vendor lock-in (OpenAI):** Memory processing requires OpenAI even if you use other LLMs.
+4. **Complexity:** 3,000+ line main class. Hard to maintain.
+5. **Incomplete features:** Optimistic locking defined but not implemented. Known bugs in roadmap.
+6. **Robustness:** No transaction management. No graceful degradation.
+
+### Production Readiness
+
+**Ready for:**
+- Personal projects (SQLite local)
+- Small-scale production (< 1000 users, PostgreSQL)
+- Prototypes and MVPs
+
+**Not ready for:**
+- High-scale production (100k+ users)
+- Real-time/low-latency applications
+- Multi-region deployments
+- Mission-critical systems (no HA, no monitoring)
+
+### Recommendation
+
+This is a **solid MVP** with **excellent core concepts** but **needs hardening for production**.
+
+**Immediate priorities:**
+1. Move memory processing to background tasks (async workers)
+2. Implement caching layer (Redis)
+3. Complete optimistic locking implementation
+4. Add transaction management
+5. Break up god class into smaller components
+6. Support non-OpenAI providers for memory extraction
+
+**Long-term priorities:**
+1. Horizontal scaling support (read replicas, sharding)
+2. Monitoring and observability (metrics, tracing)
+3. Advanced query capabilities (graph traversal, semantic search)
+4. REST API for non-Python languages
+
+---
+
+## Technical Debt Summary
+
+| Category | Severity | Count | Examples |
+|----------|----------|-------|----------|
+| **Performance** | High | 4 | Blocking sync processing, no caching, no background tasks |
+| **Scalability** | High | 3 | Single database bottleneck, no sharding, unbounded memory growth |
+| **Data Integrity** | Critical | 3 | No transactions, no locking, duplicate memory bug |
+| **Vendor Lock-in** | High | 1 | OpenAI required for all memory processing |
+| **Code Quality** | Medium | 2 | 3k-line god class, mixed concerns |
+| **Robustness** | Medium | 5 | No graceful degradation, weak error handling, no monitoring |
+| **Security** | Low | 2 | Inconsistent input validation, no rate limiting integration |
+| **Compatibility** | Low | 2 | Deprecated datetime usage, known Postgres FTS issues |
+
+**Total Issues Identified:** 20 problems across 8 categories
+
+**Total Well-Designed Features:** 20 excellent design decisions
+
+This codebase is **balanced** - strong architectural foundations with significant execution gaps. The vision is clear, implementation needs maturity.
+
+---
+
+*End of Investigation Report*