**improvements/README.md** (74 additions, 0 deletions)
These files are intended to be used as the basis for creating individual GitHub issues.
| Feature | Status |
|---------|--------|
| Dark/Light Theme Toggle | Done |
| Keyboard Shortcuts (Ctrl+S) | Done |

---

## V2 Architecture — Next-Gen Agent Capabilities

Detailed, professional templates for the V2 architecture: memory, multi-agent systems, predictive intelligence, self-learning, security, integrations, and adaptive UI.

### Block 1 — Memory System (Foundation)

| # | File | Area | Complexity |
|---|------|------|-----------|
| V2-01 | [v2-01-semantic-vector-memory.md](v2-01-semantic-vector-memory.md) | Semantic Vector Memory with Embeddings | Medium |
| V2-02 | [v2-02-associative-memory-graph.md](v2-02-associative-memory-graph.md) | Associative Graph-Based Memory | High |
| V2-03 | [v2-03-memory-prioritization-engine.md](v2-03-memory-prioritization-engine.md) | Importance-Based Memory Retention | Medium |

### Block 2 — Predictive Intelligence

| # | File | Area | Complexity |
|---|------|------|-----------|
| V2-04 | [v2-04-prediction-engine.md](v2-04-prediction-engine.md) | Prediction Engine for Next User Actions | High |
| V2-05 | [v2-05-predictive-caching.md](v2-05-predictive-caching.md) | Predictive Caching Layer | Medium |
| V2-06 | [v2-06-anomaly-detection.md](v2-06-anomaly-detection.md) | Anomaly Detection for Unusual Behavior | Medium |

### Block 3 — Multi-Agent System

| # | File | Area | Complexity |
|---|------|------|-----------|
| V2-07 | [v2-07-agent-registry.md](v2-07-agent-registry.md) | Agent Registry and Roles | Very High |
| V2-08 | [v2-08-task-delegation.md](v2-08-task-delegation.md) | Automatic Task Delegation System | Very High |
| V2-09 | [v2-09-pipeline-execution.md](v2-09-pipeline-execution.md) | Pipeline-Based Task Execution | Very High |
| V2-10 | [v2-10-self-correcting-loop.md](v2-10-self-correcting-loop.md) | Self-Correcting Execution Loop | High |

### Block 4 — Time Intelligence

| # | File | Area | Complexity |
|---|------|------|-----------|
| V2-11 | [v2-11-temporal-context.md](v2-11-temporal-context.md) | Time-Aware Context System | Medium |
| V2-12 | [v2-12-predictive-scheduling.md](v2-12-predictive-scheduling.md) | Smart Task Scheduling | High |

### Block 5 — Security Layer

| # | File | Area | Complexity |
|---|------|------|-----------|
| V2-13 | [v2-13-zero-trust-execution.md](v2-13-zero-trust-execution.md) | Zero-Trust Validation for Actions | High |
| V2-14 | [v2-14-audit-trail.md](v2-14-audit-trail.md) | Full Audit Logs for Agent Decisions | High |

### Block 6 — Integrations

| # | File | Area | Complexity |
|---|------|------|-----------|
| V2-15 | [v2-15-unified-integration-layer.md](v2-15-unified-integration-layer.md) | Unified API Layer for External Services | High |
| V2-16 | [v2-16-webhooks-event-bus.md](v2-16-webhooks-event-bus.md) | Event-Driven Architecture | High |

### Block 7 — Generative UI

| # | File | Area | Complexity |
|---|------|------|-----------|
| V2-17 | [v2-17-dynamic-dashboard.md](v2-17-dynamic-dashboard.md) | Dynamic Dashboard Generation | High |
| V2-18 | [v2-18-ai-widget-generator.md](v2-18-ai-widget-generator.md) | Auto-Generated Widgets Based on Usage | High |

### Block 8 — Self-Improvement

| # | File | Area | Complexity |
|---|------|------|-----------|
| V2-19 | [v2-19-feedback-learning.md](v2-19-feedback-learning.md) | Feedback-Based Learning Loop | High |
| V2-20 | [v2-20-adaptive-prompting.md](v2-20-adaptive-prompting.md) | Dynamic Prompt Optimization | Very High |

### Block 9 — Agent Network (Advanced / Optional)

| # | File | Area | Complexity |
|---|------|------|-----------|
| V2-21 | [v2-21-multi-agent-network.md](v2-21-multi-agent-network.md) | Cross-Agent Communication Protocol | Very High |

---

## Complexity Legend

- **Low** — 1-2 days, minimal backend changes
**improvements/v2-01-semantic-vector-memory.md** (new file, 88 additions)
# Semantic Vector Memory

## Current State

The agent uses a SQLite-backed memory system (`src/memory/`) for storing conversation context. The existing `Memory.tsx` page provides basic memory browsing. Embeddings support exists via `sqlite-vec` (upgraded to `^0.1.7` stable in PR #86), but there is no semantic search API exposed to users or to the agent itself for retrieval-augmented generation.

## Problem

- Memory retrieval is keyword-based or by exact ID — no "search by meaning"
- The agent cannot recall contextually similar past conversations or tool results
- Users cannot search memory using natural language queries
- Relevant past context is lost unless explicitly referenced
- No way to surface related tasks or outcomes from prior sessions

## What to Implement

### 1. Vector Storage Layer
- **Backend**: Extend SQLite with `sqlite-vec` to store embedding vectors alongside memory entries
- **Schema**: `memory_vectors (id, memory_id FK, embedding BLOB, model TEXT, created_at)`
- **Embedding models**: Support OpenAI `text-embedding-3-small` and local alternatives (e.g., `@xenova/transformers`)
- **Storage targets**:
- Conversation messages (user + assistant turns)
- Task descriptions and outcomes
- Tool invocation results (summarized)
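A minimal sketch of the provider abstraction this layer implies. The `EmbeddingProvider` interface and the hash-based local provider are illustrative assumptions, not the actual `src/memory/embeddings.ts` API; the local provider is only useful for tests and offline development, not for real semantic similarity.

```typescript
// Sketch of a provider-agnostic embedding interface (names are
// illustrative, not the actual src/memory/embeddings.ts API).
interface EmbeddingProvider {
  readonly model: string;
  readonly dimensions: number;
  embed(text: string): Promise<number[]>;
}

// Deterministic local fallback: hashes character trigrams into a
// fixed-size vector. Not semantically meaningful, but handy for
// tests and offline development.
class HashEmbeddingProvider implements EmbeddingProvider {
  readonly model = "local-hash";
  constructor(readonly dimensions: number = 64) {}

  async embed(text: string): Promise<number[]> {
    const vec = new Array<number>(this.dimensions).fill(0);
    for (let i = 0; i + 3 <= text.length; i++) {
      let h = 0;
      for (const ch of text.slice(i, i + 3)) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
      vec[h % this.dimensions] += 1;
    }
    // L2-normalize so cosine similarity reduces to a dot product
    const norm = Math.sqrt(vec.reduce((s, x) => s + x * x, 0)) || 1;
    return vec.map((x) => x / norm);
  }
}
```

An OpenAI-backed provider would implement the same interface around `text-embedding-3-small`, so the rest of the pipeline stays provider-agnostic.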

### 2. Semantic Search API
- **Endpoint**: `GET /api/memory/search?q=<natural language query>&limit=10&threshold=0.7`
- **Flow**:
1. Embed the query string
2. Perform cosine similarity search against stored vectors
3. Return ranked results with similarity scores
- **Endpoint**: `GET /api/memory/related/:id` — find memories semantically related to a given memory entry
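The search flow above can be sketched as a plain cosine-similarity ranking. Types and names here are illustrative; the real implementation would push the nearest-neighbor query down into `sqlite-vec` rather than scan vectors in application memory.

```typescript
// Illustrative shapes for the ranking step of /api/memory/search.
interface StoredVector { memoryId: number; embedding: number[]; }
interface SearchHit { memoryId: number; score: number; }

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Embed query -> score every stored vector -> threshold -> top-K.
function rank(query: number[], store: StoredVector[],
              limit = 10, threshold = 0.7): SearchHit[] {
  return store
    .map((v) => ({ memoryId: v.memoryId, score: cosine(query, v.embedding) }))
    .filter((h) => h.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

The `limit` and `threshold` defaults mirror the query parameters in the endpoint signature above.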

### 3. Agent Context Integration
- **Auto-retrieval**: Before each LLM call, retrieve top-K relevant memories based on the current conversation context
- **Injection**: Append retrieved memories as a "relevant context" section in the system prompt
- **Configurable**: Enable/disable via `config.yaml`: `memory.semantic_search.enabled: true`
- **Token budget**: Configurable max tokens for injected context (default: 1000)
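The token-budget guard could work like this greedy selection sketch. The 4-characters-per-token estimate is a rough heuristic standing in for a real tokenizer, and all names are assumptions.

```typescript
interface RankedMemory { text: string; score: number; }

// Rough heuristic: ~4 characters per token. A real implementation
// would use the model's tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedily take the highest-ranked memories until the configured
// budget (default 1000 tokens) is exhausted; oversized entries are
// skipped so smaller relevant ones can still fit.
function selectForInjection(memories: RankedMemory[], budget = 1000): RankedMemory[] {
  const sorted = [...memories].sort((a, b) => b.score - a.score);
  const picked: RankedMemory[] = [];
  let used = 0;
  for (const m of sorted) {
    const cost = estimateTokens(m.text);
    if (used + cost > budget) continue;
    picked.push(m);
    used += cost;
  }
  return picked;
}
```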

### 4. Memory Indexing Pipeline
- **On-write**: When a new memory entry is created, compute and store its embedding asynchronously
- **Batch reindex**: `POST /api/memory/reindex` — recompute all embeddings (for model changes)
- **Progress tracking**: Reindex job with status endpoint `GET /api/memory/reindex/status`
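The reindex job and its status endpoint could share state along these lines; the shape is a sketch (a real implementation would likely batch embeddings and rate-limit provider calls).

```typescript
type JobStatus = { state: "idle" | "running" | "done"; processed: number; total: number };

class ReindexJob {
  // Read by GET /api/memory/reindex/status
  status: JobStatus = { state: "idle", processed: 0, total: 0 };

  // Recompute embeddings one memory at a time, updating progress
  // as we go. embedOne is injected so the job stays provider-agnostic.
  async run(memoryIds: number[], embedOne: (id: number) => Promise<void>): Promise<void> {
    this.status = { state: "running", processed: 0, total: memoryIds.length };
    for (const id of memoryIds) {
      await embedOne(id);
      this.status.processed += 1;
    }
    this.status.state = "done";
  }
}
```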

### 5. Web UI Enhancements
- **Location**: Enhance existing `Memory.tsx` page
- **Features**:
- Semantic search bar with natural language input
- "Similar memories" sidebar when viewing a memory entry
- Visual similarity scores on search results

### Backend Architecture
- `src/memory/vector-store.ts` — vector storage and retrieval using `sqlite-vec`
- `src/memory/embeddings.ts` — embedding computation (provider-agnostic)
- `src/memory/semantic-search.ts` — search orchestration, ranking, filtering
- `src/webui/routes/memory.ts` — extend with search endpoints

### Implementation Steps

1. Create `vector-store.ts` with `sqlite-vec` integration for insert/query
2. Create `embeddings.ts` with provider abstraction (OpenAI, local)
3. Add `memory_vectors` table migration
4. Create semantic search service with cosine similarity ranking
5. Add `/api/memory/search` and `/api/memory/related/:id` endpoints
6. Integrate auto-retrieval into `src/agent/runtime.ts` before LLM calls
7. Add reindex pipeline with job status tracking
8. Enhance `Memory.tsx` with semantic search UI
9. Add configuration options to `config.yaml`

### Files to Modify
- `src/memory/` — new files for vector store, embeddings, semantic search
- `src/webui/routes/memory.ts` — add search endpoints
- `src/agent/runtime.ts` — integrate semantic context retrieval
- `web/src/pages/Memory.tsx` — add search UI
- `web/src/lib/api.ts` — add memory search API calls
- `config.example.yaml` — add semantic search config section

### Acceptance Criteria
- Search by meaning, not keywords — "what did we discuss about performance?" returns relevant results
- API: `/api/memory/search?q=...` returns ranked results with similarity scores
- Works with existing agent context pipeline
- Configurable embedding provider and token budget

### Notes
- **Medium complexity** — `sqlite-vec` is already a dependency; the main work is the search/indexing pipeline
- Embedding computation adds latency; run asynchronously and cache aggressively
- Consider chunking long texts before embedding (max ~512 tokens per chunk)
- Rate-limit embedding API calls to avoid cost spikes during bulk reindex
**improvements/v2-02-associative-memory-graph.md** (new file, 87 additions)
# Associative Memory Graph

## Current State

Memory entries are stored as flat rows in SQLite. Relationships between entities (tasks, tools, conversations, outcomes) are implicit — buried in conversation text rather than explicitly modeled. There is no way to traverse connections like "which tools were used for task X?" or "what conversations led to outcome Y?"

## Problem

- No explicit relationships between memory entities
- Cannot answer "what tools were used when we discussed topic X?"
- Cannot trace decision chains: task → tool → outcome → follow-up
- Related context is scattered across separate memory entries with no links
- The agent cannot reason about connections between past interactions

## What to Implement

### 1. Graph Schema
- **Nodes**: Entities extracted from agent interactions
- `conversations` — individual sessions/threads
- `tasks` — user-requested tasks and their outcomes
- `tools` — tool invocations with parameters and results
- `topics` — extracted topics/themes from conversations
- `entities` — named entities (people, projects, URLs, etc.)
- **Edges**: Typed relationships
- `conversation → USED_TOOL → tool`
- `task → PRODUCED → outcome`
- `conversation → ABOUT → topic`
- `task → RELATED_TO → task`
- `entity → MENTIONED_IN → conversation`
- **Storage**: SQLite tables `graph_nodes (id, type, label, metadata JSON, created_at)` and `graph_edges (id, source_id, target_id, relation, weight, created_at)`
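The two tables above could be created with DDL along these lines. The column layout mirrors the schema described in this document; the comments and index are illustrative additions.

```typescript
// Migration sketch for the graph tables (as src/memory/graph-store.ts
// might define it). Exact migration plumbing is an assumption.
const MIGRATION = `
CREATE TABLE IF NOT EXISTS graph_nodes (
  id         INTEGER PRIMARY KEY,
  type       TEXT NOT NULL,    -- conversation | task | tool | topic | entity
  label      TEXT NOT NULL,
  metadata   TEXT,             -- JSON blob
  created_at TEXT DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS graph_edges (
  id         INTEGER PRIMARY KEY,
  source_id  INTEGER NOT NULL REFERENCES graph_nodes(id),
  target_id  INTEGER NOT NULL REFERENCES graph_nodes(id),
  relation   TEXT NOT NULL,    -- USED_TOOL | PRODUCED | ABOUT | RELATED_TO | MENTIONED_IN
  weight     REAL DEFAULT 1.0,
  created_at TEXT DEFAULT (datetime('now'))
);
CREATE INDEX IF NOT EXISTS idx_graph_edges_source ON graph_edges(source_id);
`;
```

Indexing `source_id` keeps outgoing-edge lookups cheap, which the traversal endpoints below depend on.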

### 2. Entity Extraction Pipeline
- **On each agent turn**: Extract entities and relationships using LLM-based extraction
- **Extraction prompt**: Structured output requesting entities, types, and relationships
- **Fallback**: Regex-based extraction for common patterns (URLs, @mentions, dates)
- **Deduplication**: Fuzzy matching to avoid duplicate nodes for the same entity
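The regex fallback might start from patterns like these. The pattern set and entity types are illustrative starting points, not an exhaustive extractor.

```typescript
interface ExtractedEntity { type: "url" | "mention" | "date"; value: string; }

// Cheap patterns for when LLM extraction is unavailable or skipped.
const PATTERNS: Array<[ExtractedEntity["type"], RegExp]> = [
  ["url", /https?:\/\/[^\s)]+/g],
  ["mention", /@[A-Za-z0-9_-]+/g],
  ["date", /\b\d{4}-\d{2}-\d{2}\b/g],
];

function extractFallback(text: string): ExtractedEntity[] {
  const seen = new Set<string>(); // dedup within one text
  const out: ExtractedEntity[] = [];
  for (const [type, re] of PATTERNS) {
    re.lastIndex = 0;
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      const key = `${type}:${m[0]}`;
      if (!seen.has(key)) {
        seen.add(key);
        out.push({ type, value: m[0] });
      }
    }
  }
  return out;
}
```

Cross-message deduplication (the fuzzy matching mentioned above) would happen later, at node-insertion time.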

### 3. Graph Query API
- `GET /api/memory/graph/nodes?type=tool&q=search` — list/search nodes
- `GET /api/memory/graph/node/:id/related?depth=2` — traverse relationships up to N hops
- `GET /api/memory/graph/path?from=:id&to=:id` — find shortest path between nodes
- `GET /api/memory/graph/context?task_id=:id` — get full context graph for a task
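The depth-limited traversal behind the `related?depth=N` endpoint is a plain breadth-first search over an adjacency list; this sketch assumes edges have already been loaded into memory for the neighborhood in question.

```typescript
type Adjacency = Map<number, number[]>; // nodeId -> neighbor ids

// BFS up to `depth` hops from `start`, returning the related node ids
// (the start node itself is excluded from the result).
function relatedWithinDepth(adj: Adjacency, start: number, depth: number): Set<number> {
  const visited = new Set<number>([start]);
  let frontier = [start];
  for (let hop = 0; hop < depth; hop++) {
    const next: number[] = [];
    for (const node of frontier) {
      for (const nb of adj.get(node) ?? []) {
        if (!visited.has(nb)) {
          visited.add(nb);
          next.push(nb);
        }
      }
    }
    frontier = next;
  }
  visited.delete(start);
  return visited;
}
```

Capping `depth` server-side (e.g. at 3) keeps the response bounded as the graph grows.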

### 4. Agent Context Enrichment
- When processing a new message, query the graph for related context
- Combine with semantic vector search (v2-01) for hybrid retrieval
- Provide the agent with structured relationship context, not just raw text

### 5. Graph Visualization UI
- **Location**: New tab on `Memory.tsx` page — "Knowledge Graph"
- **Library**: [react-force-graph](https://github.com/vasturiano/react-force-graph) or D3.js force layout
- **Features**:
- Interactive node-link diagram
- Filter by node type and relationship type
- Click node to see details and connected entities
- Search and highlight paths

### Backend Architecture
- `src/memory/graph-store.ts` — CRUD for nodes and edges
- `src/memory/entity-extractor.ts` — LLM-based entity/relationship extraction
- `src/memory/graph-query.ts` — traversal and path-finding algorithms
- `src/webui/routes/graph.ts` — API endpoints

### Implementation Steps

1. Design and create `graph_nodes` and `graph_edges` SQLite tables
2. Implement `graph-store.ts` with node/edge CRUD operations
3. Implement `entity-extractor.ts` with LLM-based extraction
4. Hook extraction into `src/agent/runtime.ts` post-response pipeline
5. Implement graph query service with traversal algorithms
6. Add API endpoints for graph queries
7. Create graph visualization component in `web/src/components/`
8. Add "Knowledge Graph" tab to `Memory.tsx`

### Files to Modify
- `src/memory/` — new files for graph store, extraction, queries
- `src/agent/runtime.ts` — hook entity extraction into post-response pipeline
- `src/webui/routes/` — add graph API routes
- `web/src/pages/Memory.tsx` — add graph visualization tab
- `web/package.json` — add graph visualization library

### Notes
- **High complexity** — requires entity extraction pipeline and graph algorithms
- Entity extraction via LLM adds cost per message; consider batch extraction or extracting only on "interesting" turns
- Graph size will grow; implement pagination and limit traversal depth
- Consider using the graph to enhance the semantic search from v2-01 (hybrid retrieval)
- Start with simple relationship types; extend as patterns emerge from real usage
**improvements/v2-03-memory-prioritization-engine.md** (new file, 85 additions)
# Memory Prioritization Engine

## Current State

All memory entries have equal weight. There is no mechanism to determine which memories are important and should be retained vs. which are stale or irrelevant. Over time, the memory store grows without bound, making retrieval slower and context injection noisier.

## Problem

- All memories are treated equally regardless of relevance or freshness
- No automatic cleanup of stale or low-value data
- Context injection (v2-01) has no quality signal for ranking
- Storage grows without bounds, degrading search performance
- No way to distinguish between a critical decision and a casual remark

## What to Implement

### 1. Importance Scoring Model
- **Scoring dimensions**:
- **Recency**: Exponential decay — recent memories score higher
- **Frequency**: How often a memory or its entities are referenced
- **Impact**: Did this memory lead to a successful task outcome?
- **Explicit markers**: User-flagged memories ("remember this")
- **Semantic centrality**: How connected is this node in the knowledge graph (v2-02)?
- **Composite score**: Weighted combination of all dimensions, normalized to 0.0–1.0
- **Formula**: `score = w1*recency + w2*frequency + w3*impact + w4*explicit + w5*centrality`
- **Configurable weights**: Via `config.yaml`: `memory.prioritization.weights`
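The weighted formula above can be sketched as follows. The default weights, the 30-day decay constant, and the frequency saturation curve are all assumptions to be tuned against real usage.

```typescript
interface ScoreInputs {
  ageDays: number;        // days since last touch
  referenceCount: number; // lifetime retrievals/references
  impact: number;         // 0..1, from task outcomes
  explicit: boolean;      // user-flagged "remember this"
  centrality: number;     // 0..1, from the knowledge graph (v2-02)
}
interface Weights { recency: number; frequency: number; impact: number; explicit: number; centrality: number; }

// Illustrative defaults; weights sum to 1.0 so the composite stays in 0..1.
const DEFAULT_WEIGHTS: Weights = { recency: 0.3, frequency: 0.2, impact: 0.25, explicit: 0.15, centrality: 0.1 };

function compositeScore(m: ScoreInputs, w: Weights = DEFAULT_WEIGHTS): number {
  const recency = Math.exp(-m.ageDays / 30);             // exponential decay, ~30-day scale
  const frequency = 1 - Math.exp(-m.referenceCount / 5); // saturates toward 1
  return (
    w.recency * recency +
    w.frequency * frequency +
    w.impact * m.impact +
    (m.explicit ? w.explicit : 0) +
    w.centrality * m.centrality
  );
}
```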

### 2. Scoring Pipeline
- **On-access**: Bump frequency counter when a memory is retrieved or referenced
- **On-outcome**: When a task completes successfully, boost scores of memories used in its context
- **Periodic**: Background job (configurable interval, default: 1 hour) recalculates composite scores
- **Storage**: `memory_scores (memory_id, score, recency, frequency, impact, explicit, centrality, updated_at)`

### 3. Auto-Cleanup Service
- **Retention policy**: Configurable in `config.yaml`
- `memory.retention.min_score: 0.1` — memories below this threshold are candidates for cleanup
- `memory.retention.max_age_days: 90` — hard limit regardless of score
- `memory.retention.max_entries: 10000` — cap total entries
- **Cleanup flow**: Score → rank → archive (move to `memory_archive` table) → delete after archive period
- **Protection**: Never delete user-flagged or explicitly marked memories
- **Endpoint**: `POST /api/memory/cleanup` — trigger manual cleanup with dry-run option
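Candidate selection for the cleanup flow might look like this sketch; field names are assumptions, and a dry run would simply report the ids without archiving or deleting.

```typescript
interface MemoryRow { id: number; score: number; ageDays: number; flagged: boolean; }
interface RetentionPolicy { minScore: number; maxAgeDays: number; }

// Select cleanup candidates by score threshold or hard age limit.
// Protection rule: user-flagged memories are never candidates.
function cleanupCandidates(rows: MemoryRow[], policy: RetentionPolicy): number[] {
  return rows
    .filter((r) => !r.flagged)
    .filter((r) => r.score < policy.minScore || r.ageDays > policy.maxAgeDays)
    .map((r) => r.id);
}
```

A real run would archive the selected rows to `memory_archive` before any deletion, as described above.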

### 4. Priority-Aware Retrieval
- Integrate scores into semantic search (v2-01) as a ranking boost
- `GET /api/memory/search?q=...&min_score=0.3` — filter by minimum importance
- Context injection uses score to allocate token budget: high-score memories get more space
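Blending importance into the semantic ranking could be as simple as a multiplicative boost; the blend factor here is an assumption to be tuned, not a prescribed value.

```typescript
interface Candidate { memoryId: number; similarity: number; importance: number; }

// Rank by similarity boosted by importance score, after applying the
// min_score filter from the search endpoint.
function hybridRank(cands: Candidate[], boost = 0.5, minScore = 0): Candidate[] {
  return cands
    .filter((c) => c.importance >= minScore)
    .sort((a, b) =>
      b.similarity * (1 + boost * b.importance) -
      a.similarity * (1 + boost * a.importance));
}
```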

### 5. Memory Dashboard
- **Location**: Enhance `Memory.tsx` page
- **Features**:
- Score distribution chart (histogram)
- "At risk" memories list (approaching cleanup threshold)
- Manual score adjustment (pin / unpin)
- Cleanup history log
- Storage usage stats

### Backend Architecture
- `src/memory/scoring.ts` — score calculation and update logic
- `src/memory/retention.ts` — cleanup policy evaluation and execution
- `src/memory/scheduler.ts` — periodic scoring and cleanup jobs

### Implementation Steps

1. Design `memory_scores` and `memory_archive` tables
2. Implement scoring model with configurable weights
3. Create scoring pipeline (on-access, on-outcome, periodic)
4. Implement retention policy engine with dry-run support
5. Integrate scores into semantic search ranking
6. Add API endpoints for cleanup and score management
7. Build memory dashboard UI components
8. Add configuration options to `config.yaml`

### Files to Modify
- `src/memory/` — new files for scoring, retention, scheduler
- `src/memory/semantic-search.ts` — integrate score-based ranking
- `src/webui/routes/memory.ts` — add score/cleanup endpoints
- `web/src/pages/Memory.tsx` — add score visualization and management
- `config.example.yaml` — add prioritization and retention config

### Notes
- **Medium complexity** — scoring model is straightforward; scheduling and retention need careful testing
- Cleanup is destructive — archive before deleting, and always support dry-run
- Score recalculation on large memory stores may be slow; use incremental updates where possible
- The scoring weights will need tuning based on real usage patterns
- This feature depends on v2-01 (semantic search) and benefits from v2-02 (graph centrality)