diff --git a/.gitignore b/.gitignore index 4b06cd5..bf283e8 100644 --- a/.gitignore +++ b/.gitignore @@ -1,5 +1,8 @@ node_modules/ .npm-cache/ +.tmp-npm-cache/ +.audrey-demo-tmp/ +.claude/settings.local.json audrey-data/ .tmp/ .tmp-vitest/ diff --git a/README.md b/README.md index ea421d2..52ad037 100644 --- a/README.md +++ b/README.md @@ -4,76 +4,108 @@ [![npm version](https://img.shields.io/npm/v/audrey.svg)](https://www.npmjs.com/package/audrey) [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE) -Audrey is a persistent memory and continuity engine for Claude Code and AI agents. +Audrey is a local-first memory runtime and continuity engine for AI agents. -It gives an agent a local memory store, durable recall, consolidation, contradiction handling, a REST sidecar, MCP tools, and benchmark gates without adding external infrastructure. +It gives Codex, Claude Code, Claude Desktop, Cursor, local Ollama-backed agents, and custom agent services a shared local memory store, durable recall, consolidation, contradiction handling, a REST sidecar, MCP tools, and benchmark gates without adding external infrastructure. + +Audrey also checks memory before an agent acts. Known failures, project rules, and local quirks become preflight warnings and Memory Reflexes instead of repeated mistakes. Requires Node.js 20+. ## Quick Start -### Claude Code +### 60-Second Proof ```bash -npx audrey init -npx audrey doctor +npx audrey demo ``` -This uses the default `local-offline` preset: +This runs a self-contained local demo with no API keys, no host setup, and no external model. It writes temporary memories, records a redacted tool failure, asks Audrey for a Memory Capsule, proves recall, then deletes the demo store. + +### MCP Hosts + +```bash +npx audrey mcp-config codex +npx audrey mcp-config generic +``` + +`mcp-config codex` prints a ready-to-paste Codex TOML block. `mcp-config generic` prints JSON for local stdio MCP hosts such as Claude Desktop, Cursor, Windsurf, and JetBrains. + +Claude Code also has a direct installer: + +```bash +npx audrey install +claude mcp list +``` + +All MCP paths use local embeddings by default and store memory in one SQLite-backed data directory. + +### Ollama and Local Agents -- registers Audrey with Claude Code -- installs hooks for automatic recall and reflection -- uses local embeddings by default -- stores memory in one local SQLite-backed data directory +Ollama is a local model runtime, not a memory store. Use Audrey as the sidecar memory tool layer for any Ollama-backed agent: + +```bash +AUDREY_AGENT=ollama-local-agent npx audrey serve +curl http://localhost:7437/health +``` + +Then expose Audrey's `/v1/preflight`, `/v1/reflexes`, `/v1/encode`, `/v1/recall`, `/v1/capsule`, and `/v1/status` routes as tools in the local agent loop. + +Runnable example: + +```bash +AUDREY_AGENT=ollama-local-agent npx audrey serve +OLLAMA_MODEL=qwen3 node examples/ollama-memory-agent.js "What should you remember about Audrey?" +``` ### REST or Docker Sidecar ```bash -npx audrey init sidecar-prod docker compose up -d --build ``` Then verify: ```bash -npx audrey doctor +npx audrey status curl http://localhost:3487/health ``` ## Why Audrey - Local-first: memory lives in SQLite with `sqlite-vec`, not a hosted vector database. +- Host-neutral: Audrey is a memory runtime for agent hosts, not a Claude-only extension. - Practical: MCP, CLI, REST, JavaScript, Python, and Docker are all first-class. -- Durable: snapshot, restore, health checks, benchmark gates, and graceful shutdown are built in. +- Durable: export/import, health checks, benchmark gates, and graceful shutdown are built in. - Structured: Audrey does more than save notes. It consolidates, decays, tracks contradictions, and supports procedural memory. ## What Ships -- Claude Code MCP server with 13 memory tools -- Automatic hook-based recall and reflection for Claude Code sessions +- Local stdio MCP server with 19 memory tools +- Ready-to-paste config generation for Codex and generic MCP hosts +- Hook-compatible CLI helpers for recall, reflection, and tool trace capture - JavaScript SDK - Python SDK packaged as `audrey-memory` -- REST API for sidecar deployment +- REST API for sidecar deployment and Ollama/local-agent tool bridges +- Memory Preflight for checking prior failures, risks, rules, and procedures before an agent acts +- Memory Reflexes that convert preflight evidence into trigger-response guidance agents can automate - Docker and Compose deployment path -- Snapshot and restore for portable memory state +- Export/import for portable memory state - Machine-readable health and benchmark gates - Local benchmark harness with retrieval and lifecycle-operation tracks -## Setup Presets +## Integration Modes -`npx audrey init` supports four named presets: - -| Preset | Best For | Behavior | +| Mode | Best For | Entry Point | |---|---|---| -| `local-offline` | Claude Code on one machine | Local embeddings, MCP install, hooks install | -| `hosted-fast` | Claude Code with provider keys already present | Auto-picks hosted providers from env, MCP install, hooks install | -| `ci-mock` | CI and smoke tests | Mock embedding + LLM providers, no Claude-specific setup | -| `sidecar-prod` | REST API and Docker deployment | Sidecar-oriented defaults, no Claude-specific setup | +| MCP stdio | Codex, Claude Code, Claude Desktop, Cursor, Windsurf, VS Code, JetBrains | `npx audrey mcp-config ` or `npx audrey install` for Claude Code | +| REST sidecar | Ollama-backed local agents, internal agent services, Docker | `npx audrey serve` or `docker compose up -d --build` | +| SDK direct | Node.js and TypeScript agents inside one process | `import { Audrey } from 'audrey'` | +| Python client | Python agents calling the REST sidecar | `pip install audrey-memory` | Useful checks: ```bash -npx audrey doctor npx audrey status npx audrey status --json --fail-on-unhealthy ``` @@ -130,32 +162,28 @@ brain.close() ```bash # Setup -npx audrey init -npx audrey init hosted-fast -npx audrey init ci-mock -npx audrey init sidecar-prod +npx audrey demo +npx audrey mcp-config codex +npx audrey mcp-config generic -# Claude Code integration +# MCP integration npx audrey install -npx audrey hooks install -npx audrey hooks uninstall npx audrey uninstall # Health and maintenance -npx audrey doctor npx audrey status npx audrey dream npx audrey reembed - -# Versioning -npx audrey snapshot -npx audrey restore backup.json --force +npx audrey observe-tool --event PostToolUse --tool Bash --outcome failed # Sidecar npx audrey serve +node examples/ollama-memory-agent.js "Use Audrey memory before answering" docker compose up -d --build ``` +Before risky actions, hosts can call `memory_preflight` or `memory_reflexes` over MCP, or `POST /v1/preflight` / `POST /v1/reflexes` over REST. Preflight returns the risk briefing. Reflexes return trigger-response rules such as "Before using npm test, review the prior EPERM failure path." + ## Benchmarks Audrey ships with a benchmark harness and release gate: @@ -210,7 +238,11 @@ Key environment variables: ## Documentation - [docs/benchmarking.md](docs/benchmarking.md) +- [docs/audrey-for-dummies.md](docs/audrey-for-dummies.md) +- [docs/future-of-llm-memory.md](docs/future-of-llm-memory.md) - [docs/production-readiness.md](docs/production-readiness.md) +- [docs/mcp-hosts.md](docs/mcp-hosts.md) +- [docs/ollama-local-agents.md](docs/ollama-local-agents.md) - [CONTRIBUTING.md](CONTRIBUTING.md) - [SECURITY.md](SECURITY.md) diff --git a/codex.md b/codex.md index 68a4363..5cf7ffa 100644 --- a/codex.md +++ b/codex.md @@ -4,7 +4,7 @@ ## What Audrey Is -Audrey is a **biological memory system for AI agents**. It gives agents persistent, local memory that encodes, consolidates, decays, and dreams — modeled after how human brains actually process memory. Published on npm as `audrey` (v0.20.0) and PyPI as `audrey-memory` (v0.20.0). +Audrey is a **biological memory system and local-first continuity runtime for AI agents**. It gives Codex, Claude Code, Claude Desktop, Ollama-backed local agents, and custom agent services persistent local memory that encodes, consolidates, decays, and dreams - modeled after how human brains actually process memory. Published on npm as `audrey` (v0.20.0) and PyPI as `audrey-memory` (v0.20.0). **Not a database.** Not a RAG pipeline. Not a vector store. Audrey is a *memory layer* with biological fidelity: episodic memories consolidate into semantic principles, confidence decays over time, contradictions are tracked and resolved, emotional affect influences recall, and interference between competing memories is modeled explicitly. @@ -41,7 +41,7 @@ audrey/ │ ├── audrey.ts # Main Audrey class — EventEmitter, owns all methods │ ├── index.ts # Barrel re-exports (SDK entry point) │ ├── server.ts # HTTP server (Hono + @hono/node-server) -│ ├── routes.ts # 13 REST endpoints + /health +│ ├── routes.ts # 15 /v1 REST endpoints + /health │ ├── encode.ts # Episode encoding with auto-supersede │ ├── recall.ts # KNN vector recall with 6-signal confidence scoring │ ├── consolidate.ts # Cluster episodes → extract principles (LLM or heuristic) @@ -66,7 +66,7 @@ audrey/ │ ├── ulid.ts # Monotonic ULID generation │ └── utils.ts # Cosine similarity, JSON parse, API key validation ├── mcp-server/ # MCP server + CLI (2 modules) -│ ├── index.ts # 13 MCP tools + CLI (install/uninstall/status/greeting/reflect/dream/reembed/serve) +│ ├── index.ts # 19 MCP tools + CLI (install/uninstall/status/greeting/reflect/dream/reembed/serve) │ └── config.ts # Provider resolution, VERSION constant, install args ├── python-sdk/ # Python SDK (pip install audrey-memory) │ ├── pyproject.toml # Hatchling build, deps: httpx + pydantic @@ -133,10 +133,13 @@ const dream = await brain.dream(); brain.close(); ``` -### 2. MCP Server (Claude Code / Cursor / Windsurf) +### 2. MCP Server (Codex / Claude Code / Claude Desktop / Cursor / Windsurf) ```bash -npx audrey install # registers MCP server with Claude Code +npx audrey demo # self-contained local proof, no keys or host setup +npx audrey mcp-config codex # prints ready-to-paste Codex TOML +npx audrey mcp-config generic # prints JSON for stdio MCP hosts +npx audrey install # registers MCP server with Claude Code npx audrey status # check health npx audrey greeting # session briefing (for hooks) npx audrey reflect # form memories from conversation (for hooks) @@ -144,7 +147,7 @@ npx audrey dream # consolidation + decay cycle npx audrey serve # start HTTP API on port 7437 ``` -13 MCP tools: `memory_encode`, `memory_recall`, `memory_consolidate`, `memory_dream`, `memory_introspect`, `memory_resolve_truth`, `memory_export`, `memory_import`, `memory_forget`, `memory_decay`, `memory_status`, `memory_reflect`, `memory_greeting`. +19 MCP tools: `memory_encode`, `memory_recall`, `memory_consolidate`, `memory_dream`, `memory_introspect`, `memory_resolve_truth`, `memory_export`, `memory_import`, `memory_forget`, `memory_decay`, `memory_status`, `memory_reflect`, `memory_greeting`, `memory_observe_tool`, `memory_recent_failures`, `memory_capsule`, `memory_preflight`, `memory_reflexes`, `memory_promote`. ### 3. HTTP API @@ -161,7 +164,7 @@ curl -X POST http://localhost:7437/v1/recall \ -d '{"query":"test"}' ``` -14 endpoints: `GET /health`, `POST /v1/encode`, `POST /v1/recall`, `POST /v1/consolidate`, `POST /v1/dream`, `GET /v1/introspect`, `POST /v1/resolve-truth`, `GET /v1/export`, `POST /v1/import`, `POST /v1/forget`, `POST /v1/decay`, `GET /v1/status`, `POST /v1/reflect`, `POST /v1/greeting`. +16 `/v1` endpoints plus `GET /health`: `POST /v1/encode`, `POST /v1/recall`, `POST /v1/capsule`, `POST /v1/preflight`, `POST /v1/reflexes`, `POST /v1/consolidate`, `POST /v1/dream`, `GET /v1/introspect`, `POST /v1/resolve-truth`, `GET /v1/export`, `POST /v1/import`, `POST /v1/forget`, `POST /v1/decay`, `GET /v1/status`, `POST /v1/reflect`, `POST /v1/greeting`. ### 4. Python SDK @@ -306,7 +309,7 @@ Schema is in `src/db.ts`. Migrations are in the `MIGRATIONS` array (currently v1 | `AUDREY_DEVICE` | No | `gpu` | Local embedding device (`gpu` or `cpu`) | | `AUDREY_PORT` | No | `7437` | HTTP API server port | | `AUDREY_API_KEY` | No | — | Bearer token for HTTP API auth | -| `AUDREY_AGENT` | No | `claude-code` | Agent name for MCP server | +| `AUDREY_AGENT` | No | `local-agent` | Agent name for MCP server | Auto-detection priority: `GOOGLE_API_KEY` → Gemini embeddings; `ANTHROPIC_API_KEY` → Anthropic LLM; no keys → local embeddings (384d, offline). diff --git a/docs/audrey-for-dummies.md b/docs/audrey-for-dummies.md new file mode 100644 index 0000000..9fd0367 --- /dev/null +++ b/docs/audrey-for-dummies.md @@ -0,0 +1,659 @@ +# Audrey For Dummies + +Date: 2026-04-24 + +This guide explains Audrey in plain language. It assumes you know what an AI assistant is, but not how memory systems work. + +## The One-Sentence Version + +Audrey is a local brain for AI agents. + +It gives tools like Codex, Claude Code, Claude Desktop, Cursor, Ollama agents, and custom apps a shared memory that can remember facts, decisions, procedures, failures, preferences, and project context across sessions. + +## The Problem Audrey Solves + +Most AI agents are powerful but forgetful. + +You can spend an hour teaching an agent how your project works, what failed before, what commands are safe, what your customer cares about, and how you like work done. Then the next session starts and the agent often needs that same context again. + +Large context windows help, but they are not the same as memory. A context window is what the model can see right now. Memory is what the system decides is worth keeping, organizing, updating, recalling, and eventually forgetting. + +Audrey gives agents a durable memory layer so they do not have to start from zero every time. + +## What Audrey Is + +Audrey is: + +- A local-first memory runtime. +- A SQLite-backed memory database. +- A vector-search recall engine. +- A Model Context Protocol server for AI tools. +- A REST API sidecar for local agents and services. +- A JavaScript library. +- A Python client. +- A benchmarked memory system with health checks. + +Audrey is not: + +- A replacement for an LLM. +- A hosted chatbot. +- A vector database only. +- A regulated compliance platform by itself. +- A magic guarantee that an agent will always remember correctly. + +## Why "Local-First" Matters + +Local-first means Audrey can store memory on your machine or inside your deployment boundary instead of forcing you to send memory to a hosted vendor. + +By default, Audrey stores data under: + +```text +C:\Users\\.audrey\data +``` + +You can change that with: + +```bash +AUDREY_DATA_DIR=B:\path\to\audrey-data +``` + +Use one shared data directory when you want multiple hosts to share memory. Use separate directories when you need strict separation by customer, project, environment, or agent. + +## The Basic Loop + +Audrey does seven core things. + +1. Encode memory. +2. Recall memory. +3. Build Memory Capsules. +4. Dream over memory. +5. Track tool traces and failures. +6. Run Memory Preflight before actions. +7. Turn important warnings into Memory Reflexes. + +### 1. Encode Memory + +Encoding means storing something worth remembering. + +Examples: + +- "This repo uses TypeScript ES modules only." +- "On this machine, Vitest can fail with `spawn EPERM`; use build, typecheck, benchmarks, and direct dist smokes as fallback evidence." +- "The customer wants website changes explained in business language, not technical language." +- "Before starting a task, ask Audrey for a Memory Capsule." + +Good memories are durable. They are likely to help again later. + +Bad memories are raw noise. Do not store every sentence of every chat unless you have a clear reason. + +### 2. Recall Memory + +Recall means asking Audrey for memories related to the current task. + +Example: + +```bash +npx audrey +``` + +In MCP hosts, the agent calls tools such as `memory_recall`. In REST mode, local agents call `/v1/recall`. + +### 3. Build Memory Capsules + +A Memory Capsule is a compact task briefing. + +Instead of dumping every matching memory into the model, Audrey groups useful memories into a structured packet with reasons. This is the right shape for agent context. + +Use cases: + +- "What should Codex know before editing this repo?" +- "What should an Ollama agent remember before answering this customer?" +- "What project rules matter before release?" +- "What risks have happened before?" + +REST route: + +```text +POST /v1/capsule +``` + +### 4. Dream Over Memory + +Dreaming is Audrey's maintenance and consolidation step. + +It can: + +- Find patterns. +- Promote repeated lessons into stronger memories. +- Detect contradictions. +- Decay stale memories. +- Consolidate episodes into semantic or procedural knowledge. + +Run it manually: + +```bash +npx audrey dream +``` + +In production, schedule it during low-traffic windows. + +### 5. Track Tool Traces + +Agents do not just chat. They use tools, run commands, edit files, call APIs, and sometimes fail. + +Audrey can remember those tool outcomes. + +Example: + +```bash +npx audrey observe-tool --event PostToolUse --tool Bash --outcome failed +``` + +Why this matters: + +If an agent keeps running into the same environment failure, Audrey can turn that failure into a future warning or procedure. + +### 6. Run Memory Preflight + +Preflight means asking Audrey what the agent should know before it acts. + +Example: + +```text +Before running npm test, check whether this failed before, whether there are release rules, and whether there is a safer known procedure. +``` + +Audrey returns: + +- `decision`: `go`, `caution`, or `block`. +- `risk_score`: how serious the remembered risks are. +- `warnings`: prior failures, must-follow rules, risks, contradictions, or uncertain memories. +- `recommended_actions`: what the agent should do next. +- `evidence_ids`: memories that support the warning. + +### 7. Use Memory Reflexes + +Memory Reflexes are preflight results shaped as trigger-response rules. + +Example: + +```text +Trigger: Before using npm test +Response: Review the prior EPERM failure path before re-running the command. +``` + +This is the product pivot: Audrey is not only a memory store. It is a reflex layer that helps agents stop repeating expensive mistakes. + +REST route: + +```text +POST /v1/reflexes +``` + +## The Fastest Demo + +Run: + +```bash +npx audrey demo +``` + +This does not need API keys, Claude, Codex, Ollama, or any hosted model. + +The demo: + +- Creates a temporary memory store. +- Writes example memories. +- Records a redacted tool failure. +- Builds a Memory Capsule. +- Proves recall. +- Deletes the temporary store unless you pass `--keep`. + +## Three Ways To Use Audrey + +### 1. MCP Mode + +Use this when connecting Audrey to tools that support Model Context Protocol. + +Examples: + +- Codex +- Claude Code +- Claude Desktop +- Cursor +- Windsurf +- VS Code Copilot +- JetBrains AI Assistant + +Generate host config: + +```bash +npx audrey mcp-config codex +npx audrey mcp-config generic +npx audrey mcp-config vscode +``` + +Claude Code has a direct installer: + +```bash +npx audrey install +claude mcp list +``` + +### 2. REST Sidecar Mode + +Use this when building your own local agent, web app, CRM assistant, or Ollama-backed tool loop. + +Start Audrey: + +```bash +npx audrey serve +``` + +Health check: + +```bash +curl http://localhost:7437/health +``` + +Useful routes: + +```text +GET /health +GET /v1/status +POST /v1/encode +POST /v1/recall +POST /v1/capsule +POST /v1/preflight +POST /v1/reflexes +POST /v1/export +POST /v1/import +``` + +### 3. SDK Mode + +Use this when embedding Audrey directly in a Node.js app. + +```js +import { Audrey } from 'audrey'; + +const brain = new Audrey({ + dataDir: './.audrey-data', + agent: 'my-agent', +}); + +await brain.encode({ + content: 'This project prefers ES modules.', + source: 'direct-observation', + tags: ['project-rule'], +}); + +const memories = await brain.recall('project module format', { limit: 3 }); +console.log(memories); + +brain.close(); +``` + +## Ollama And Local Agents + +Ollama runs local models. Audrey gives those local models memory. + +Start Audrey: + +```bash +AUDREY_AGENT=ollama-local-agent npx audrey serve +``` + +Run the example agent: + +```bash +OLLAMA_MODEL=qwen3 node examples/ollama-memory-agent.js "What should you remember about this project?" +``` + +The example uses Ollama tool calling and Audrey REST routes. It exposes Audrey tools for: + +- `memory_preflight` +- `memory_reflexes` +- `memory_capsule` +- `memory_recall` +- `memory_encode` + +## Memory Types + +Audrey stores several kinds of memory. + +### Episodic Memory + +Something that happened. + +Example: + +```text +The release smoke on 2026-04-24 passed build, typecheck, pack dry-run, and the demo command. +``` + +### Semantic Memory + +A general fact or principle. + +Example: + +```text +Audrey is host-neutral and should not be framed as Claude-only. +``` + +### Procedural Memory + +How to do something. + +Example: + +```text +Before calling a release ready, run build, typecheck, benchmark, pack dry-run, and direct CLI smoke. +``` + +### Tool Trace Memory + +What happened when a tool ran. + +Example: + +```text +npm test failed with spawn EPERM on a locked-down Windows host. +``` + +## Memory Metadata + +A memory is more useful when it has metadata. + +Important fields: + +- `source`: where the memory came from. +- `tags`: searchable labels. +- `salience`: importance. +- `context`: project, task, customer, host, or environment. +- `affect`: emotional or urgency signal. +- `private`: whether it should be excluded from public recall results. + +Example encode body: + +```json +{ + "content": "Use npm run typecheck before claiming TypeScript changes are safe.", + "source": "direct-observation", + "tags": ["procedure", "release-gate"], + "salience": 0.8, + "context": { + "repo": "audrey", + "host": "codex" + } +} +``` + +## Beginner Rules For Good Memory + +Use these rules when deciding what Audrey should remember. + +- Store lessons that will matter again. +- Store procedures, not just facts. +- Store failures that should not be repeated. +- Store user preferences when they affect future work. +- Store project conventions. +- Store business context that saves explanation later. +- Do not store raw secrets, API keys, passwords, or private customer data unless your deployment is designed for it. +- Do not blindly store everything. +- Prefer short, clear memories over giant pasted transcripts. +- Add tags. +- Run `npx audrey status` when recall seems wrong. + +## Command Cheat Sheet + +```bash +# Run the local proof demo +npx audrey demo + +# Print Codex MCP config +npx audrey mcp-config codex + +# Print generic MCP JSON +npx audrey mcp-config generic + +# Install into Claude Code +npx audrey install + +# Remove from Claude Code +npx audrey uninstall + +# Start REST sidecar +npx audrey serve + +# Check memory health +npx audrey status +npx audrey status --json --fail-on-unhealthy + +# Consolidate memory +npx audrey dream + +# Repair vector/index drift +npx audrey reembed + +# Record a tool result +npx audrey observe-tool --event PostToolUse --tool Bash --outcome failed +``` + +## HTTP Examples + +Start the server: + +```bash +npx audrey serve +``` + +Encode a memory: + +```bash +curl -X POST http://localhost:7437/v1/encode ^ + -H "Content-Type: application/json" ^ + -d "{\"content\":\"Audrey should work across Codex, Claude, and Ollama.\",\"source\":\"direct-observation\",\"tags\":[\"host-neutral\"]}" +``` + +Recall memory: + +```bash +curl -X POST http://localhost:7437/v1/recall ^ + -H "Content-Type: application/json" ^ + -d "{\"query\":\"host neutral Audrey\",\"limit\":5}" +``` + +Build a Memory Capsule: + +```bash +curl -X POST http://localhost:7437/v1/capsule ^ + -H "Content-Type: application/json" ^ + -d "{\"query\":\"How should an agent use Audrey before starting work?\",\"budget_chars\":3000}" +``` + +PowerShell equivalent: + +```powershell +Invoke-RestMethod -Method Post -Uri http://localhost:7437/v1/capsule ` + -ContentType 'application/json' ` + -Body '{"query":"How should an agent use Audrey before starting work?","budget_chars":3000}' +``` + +Run Memory Preflight: + +```powershell +Invoke-RestMethod -Method Post -Uri http://localhost:7437/v1/preflight ` + -ContentType 'application/json' ` + -Body '{"action":"run npm test before release","tool":"npm test","include_capsule":false}' +``` + +## Production Basics + +For real deployments: + +- Pin `AUDREY_EMBEDDING_PROVIDER`. +- Pin `AUDREY_LLM_PROVIDER` if using LLM-backed consolidation. +- Set a dedicated `AUDREY_DATA_DIR`. +- Use one data directory per tenant boundary. +- Set `AUDREY_API_KEY` before exposing REST beyond localhost. +- Run `npx audrey status --json --fail-on-unhealthy` in health checks. +- Schedule `npx audrey dream`. +- Backup the data directory before migrations or provider changes. +- Keep secrets out of memory. +- Put encryption, access control, and audit logging around Audrey at the host layer. + +## Small Business Use Cases + +Audrey is especially practical for small businesses because their operational knowledge is usually scattered across the owner, a few employees, emails, spreadsheets, website notes, CRM records, and repeated manual fixes. + +### Website Optimization + +Audrey can remember: + +- What the business sells. +- Which pages convert. +- Which SEO changes were already tried. +- Which technical issues recur. +- The owner's tone and brand preferences. + +### CRM Assistant + +Audrey can remember: + +- Customer preferences. +- Follow-up rules. +- Common objections. +- Deal stage quirks. +- Which fields matter in the CRM. + +### Support Agent + +Audrey can remember: + +- Recurring customer issues. +- Approved response patterns. +- Escalation rules. +- Past fixes. +- Product or service constraints. + +### Internal Operations + +Audrey can remember: + +- How invoices are handled. +- Which vendor has special terms. +- How reports are generated. +- What failed during the last migration. +- Which automations are safe to run. + +## Troubleshooting + +### `npx audrey demo` Fails + +Run: + +```bash +npx audrey status +node --version +``` + +Audrey requires Node.js 20 or newer. + +### Codex Or Claude Cannot Find Audrey + +Generate a pinned config: + +```bash +npx audrey mcp-config codex +npx audrey mcp-config generic +``` + +If a Windows MCP host cannot find `npx`, use `cmd /c npx -y audrey` in the host config. + +### Recall Returns Nothing + +Check health: + +```bash +npx audrey status --json --fail-on-unhealthy +``` + +If the embedding dimensions changed, run: + +```bash +npx audrey reembed +``` + +### Local Embeddings Are Slow + +The local embedding provider may download or initialize model assets. For quick CI or demos, use mock providers. For production, pin the provider explicitly. + +### REST Returns Unauthorized + +If `AUDREY_API_KEY` is set, requests need: + +```text +Authorization: Bearer +``` + +### Tests Fail With `spawn EPERM` + +On some locked-down Windows hosts, Vitest/Vite worker startup can fail with `spawn EPERM`. Treat that as a local execution blocker. Use build, typecheck, benchmark checks, package dry-run, and direct Node smokes as fallback evidence. + +## Glossary + +### Agent + +An AI system that can take actions, use tools, or work across steps. + +### MCP + +Model Context Protocol. A standard way for AI tools to call external tools and access resources. + +### REST Sidecar + +A local HTTP service that another app or agent can call. + +### Embedding + +A numeric representation of text used for similarity search. + +### Vector Search + +Searching by meaning instead of exact words. + +### Memory Capsule + +A compact briefing of memories relevant to a task. + +### Dream + +Audrey's consolidation and maintenance cycle. + +### Tool Trace + +A record of what happened when an agent used a tool. + +### Re-Embedding + +Rebuilding vector indexes when the embedding provider or dimensions change. + +## The Mental Model + +Think of Audrey like a project notebook that AI agents can read and update, except it is structured, searchable, local, and designed for automation. + +The best use is not "remember everything." + +The best use is: + +> Remember the lessons, preferences, procedures, and failures that make the next session better than the last one. + +## Where To Go Next + +- Run `npx audrey demo`. +- Read `docs/mcp-hosts.md` to connect Codex, Claude, Cursor, Windsurf, VS Code, or JetBrains. +- Read `docs/ollama-local-agents.md` for local Ollama-backed agents. +- Read `docs/production-readiness.md` before using Audrey in a real deployment. +- Read `docs/future-of-llm-memory.md` for the forward-looking product roadmap. diff --git a/docs/future-of-llm-memory.md b/docs/future-of-llm-memory.md new file mode 100644 index 0000000..70d7b8e --- /dev/null +++ b/docs/future-of-llm-memory.md @@ -0,0 +1,452 @@ +# The Future of LLM Memory + +Date: 2026-04-24 +Audience: Audrey product strategy, technical roadmap, launch content + +## Thesis + +The next serious AI platform will not win because it has the longest context window. It will win because it remembers the right things, forgets the wrong things, proves why a memory matters, and carries learned behavior across tools, hosts, teams, and time. + +The market has already accepted memory as a product category. Claude has project-scoped memory and Managed Agents memory. ChatGPT has saved memories and chat-history reference. Gemini has saved info and past-chat reference. Letta, Mem0, Zep/Graphiti, MemOS, and MIRIX all point toward the same conclusion: stateless agents are not enough. + +Audrey's opening is not "we also have memory." The opening is: + +> Audrey is the local-first memory control plane for every agent you run. + +That means Audrey should become the inspectable, portable, host-neutral layer that turns work into durable memory, and memory into better behavior. + +## What The Field Already Has + +### Platform Memory + +Claude introduced memory for teams and projects, with optional controls, editable summaries, incognito chats, and project separation. Anthropic also announced built-in memory for Claude Managed Agents on April 23, 2026, with filesystem-backed memories, exports, API management, audit logs, rollback, scoped stores, and multi-agent sharing. + +OpenAI's ChatGPT memory exposes saved memories, chat-history reference, temporary chats, deletion controls, memory prioritization, and memory history/restore controls for supported plans. + +Google Gemini has saved info and can reference past chats in supported accounts and contexts. + +The user-facing lesson: users now expect assistants to remember. + +The product gap: these memories are mostly locked inside each platform. + +### Agent Framework Memory + +Letta frames agents as stateful systems with memory blocks, archival memory, messages, tools, runs, and shared blocks. Mem0 focuses on scalable extraction and retrieval for production agents. Zep/Graphiti uses temporal knowledge graphs to track changing entity relationships. MemOS frames memory as an OS-managed resource with provenance and versioning. MIRIX uses multiple memory types and a multi-agent controller, including multimodal screen memory. + +The infrastructure lesson: memory is becoming its own layer. + +The product gap: no simple open standard lets a normal developer connect Codex, Claude Code, Claude Desktop, Cursor, Ollama, and internal agents to one controllable local memory runtime. + +### Benchmarks + +LoCoMo tests very long, multimodal conversations across sessions, temporal event graphs, and causal consistency. LongMemEval tests information extraction, multi-session reasoning, temporal reasoning, knowledge updates, and abstention. Mem0's public benchmark work emphasizes token efficiency, latency, and cost, not just raw accuracy. New 2026 benchmarks like MemoryCD and Mem2ActBench push the field toward cross-domain lifelong personalization and memory-driven tool action. + +The benchmark lesson: "it remembered a fact" is too shallow. + +The product gap: operators need memory tests and regression gates they can run before trusting an agent with real workflows. + +## What Humans Have Not Really Done Yet + +This section is not claiming nobody has written a paper or prototype. It means these ideas are not yet common, packaged, trusted, and easy enough for normal teams to use. + +### 1. A User-Owned Memory Passport + +Current memory is platform-bound. ChatGPT remembers inside ChatGPT. Claude remembers inside Claude. A local Ollama agent remembers only if a developer builds memory for it. + +Audrey can turn memory into a portable "passport": + +- One user or team memory store that travels across Codex, Claude, Ollama, IDEs, and internal agents. +- Export/import as a first-class workflow, not an afterthought. +- Host-specific agent identities layered on top of shared memory. +- A visible "what this agent knows about me and this project" control panel. + +Feature candidate: + +- `npx audrey passport export` +- `npx audrey passport import` +- `npx audrey passport inspect --agent codex` +- `npx audrey passport diff --host claude-code --host codex` + +Why it could be viral: + +People are already frustrated that each AI starts over. "Bring your AI memory with you" is instantly understandable. + +### 2. Git For Memory + +Claude Managed Agents now highlight file-backed memories, audit logs, rollback, and redaction. That is a strong signal. But the broader agent ecosystem still lacks a developer-native model for memory branching and review. + +Audrey can make memory feel like git: + +- Diff memories before and after an agent session. +- Commit a memory state before risky work. +- Branch memory per project or customer. +- Merge lessons from one agent into another. +- Roll back bad memories without destroying the whole store. +- Review memory writes like code review. + +Feature candidate: + +- `npx audrey memory diff` +- `npx audrey memory commit -m "learn Windows EPERM workaround"` +- `npx audrey memory branch customer-acme` +- `npx audrey memory rollback ` + +Why it matters: + +The more powerful memory gets, the more teams need change control. + +### 3. Memory As A Preflight Safety System + +Most memory systems retrieve context after the user asks a question. The bigger opportunity is to use memory before the agent acts. + +Audrey already has the seed of this with tool-trace memory. Repeated failures should become warnings before the agent retries the same risky operation. + +Now shipping as the first concrete slice: + +- MCP tool `memory_preflight`. +- MCP tool `memory_reflexes`. +- REST route `POST /v1/preflight`. +- REST route `POST /v1/reflexes`. +- SDK method `audrey.preflight(action, options)`. +- SDK method `audrey.reflexes(action, options)`. +- Response fields for decision, risk score, warnings, recommendations, evidence IDs, health, recent failures, optional event recording, and optional capsule context. +- Reflex reports that convert warnings into trigger-response rules an agent can automate before tool use. + +Follow-on candidates: + +- Before shell commands: "Have we broken this repo with this command before?" +- Before migrations: "Does memory say this environment lacks `wmic` or blocks temp writes?" +- Before package publishing: "What are the known release gates for this repo?" +- Before editing config: "Has this host config path been stale or write-protected?" + +Why it could be viral: + +A demo where Audrey stops an agent from repeating a known failure is more compelling than a chatbot remembering a favorite color. + +### 4. Action Memory, Not Just Answer Memory + +Mem2ActBench explicitly calls out a gap: benchmarks often test passive fact retrieval, while real agents need to apply memory to select tools and ground parameters. This is the difference between "I remember your CRM is HubSpot" and "I will call the HubSpot tool with the right pipeline, property names, and customer scope because that is how your business works." + +Audrey should feature action memory: + +- Tool preferences. +- Environment quirks. +- Known-safe command patterns. +- API parameter conventions. +- Repeated manual fixes. +- Customer-specific operational workflows. + +Feature candidate: + +- `memory_procedure_suggest` +- `memory_preflight` +- `memory_reflexes` +- `memory_action_context` +- "Before using this tool, Audrey recommends..." + +Why it matters: + +Small businesses do not need AI that remembers trivia. They need AI that remembers how work actually gets done. + +### 5. Memory Regression Tests + +Memory can silently get worse. A new embedding provider, schema migration, pruning rule, or prompt change can cause an agent to forget the exact thing that made it valuable. + +Audrey should treat memory like tested infrastructure: + +- Memory fixtures. +- Recall assertions. +- Capsule assertions. +- "Should not recall" tests for privacy and stale facts. +- Budget tests for token cost. +- Temporal tests for changed facts. + +Feature candidate: + +```bash +npx audrey eval add "What is the deploy command?" --expect "npm run deploy" +npx audrey eval run +npx audrey eval ci --fail-under 0.90 +``` + +Why it matters: + +This turns Audrey from a feature into infrastructure that teams can trust. + +### 6. Permission-Aware Shared Memory For Teams + +Research on collaborative memory is moving toward multi-user, multi-agent memory with dynamic access control, provenance, private fragments, shared fragments, and time-varying permissions. + +Audrey's local-first story should include this, especially for small businesses: + +- Owner memory. +- Employee memory. +- Customer memory. +- Project memory. +- Agent memory. +- Read/write scopes by host and role. + +Feature candidate: + +- `audrey://scopes` +- `memory_share --scope team --redact private` +- `memory_policy test --agent codex --user owner` + +Why it matters: + +Shared memory without access control becomes a liability. Access control without usability becomes shelfware. + +### 7. A Preference Model That Learns From Weak Feedback + +The 2026 VARS paper argues that agents need persistent user models and can learn retrieval preferences from weak scalar feedback, not just explicit "remember this" commands. + +Audrey can support this without fine-tuning: + +- Track when retrieved memories helped. +- Track when a user corrected the agent. +- Boost memories that reduce retries or shorten sessions. +- Decay memories that repeatedly fail to help. +- Separate long-term preferences from session-specific context. + +Feature candidate: + +- `memory_feedback` +- `memory_recall --learn-from-outcome` +- `memory_status --preference-drift` + +Why it matters: + +Good memory is not only what was said. It is what repeatedly proved useful. + +### 8. Temporal Truth, Not Flat Facts + +Zep/Graphiti's temporal graph work is a strong signal: real memory needs to know when a fact was true, what replaced it, and what evidence supports it. + +Audrey should make this obvious: + +- "Customer uses Stripe" may be true until they migrate to Square. +- "Run tests with Vitest" may be true in CI but false on a locked-down Windows host. +- "The README says `/docs` exists" may be stale after routes changed. + +Feature candidate: + +- `valid_from`, `valid_until`, `supersedes`, `superseded_by`. +- Memory conflict timelines. +- Capsule sections for "current truth" and "stale but relevant history." + +Why it matters: + +The most dangerous memory is a true fact from the wrong time. + +### 9. Multimodal Operational Memory + +MIRIX shows the importance of multimodal memory, including screenshots and visual context. Most practical agents still remember text far better than UI state, screenshots, invoices, dashboards, browser flows, and design assets. + +Audrey could target operational multimodal memory: + +- Website screenshots before and after optimization. +- CRM screenshots and field mappings. +- Error dialogs. +- Browser traces. +- Invoice images and extracted fields. +- Design screenshots tied to implementation notes. + +Feature candidate: + +- `memory_encode_asset` +- `memory_recall_assets` +- `audrey://recent-screens` +- Visual evidence inside Memory Capsules. + +Why it matters: + +Small-business work is visual and operational, not just chat text. + +### 10. Memory Capsules As A Standard Handoff Artifact + +Audrey's Memory Capsule is the right product surface: a compact, ranked, evidence-backed briefing for a specific task. + +The opportunity is to make capsules portable: + +- A Codex capsule before coding. +- A Claude capsule before planning. +- An Ollama capsule before a local answer. +- A CRM capsule before customer follow-up. +- A support capsule before replying to a ticket. + +Feature candidate: + +- `.audrey/capsules/.md` +- `npx audrey capsule "shipping release" --format md` +- Capsule links in PRs, tickets, and handoff docs. + +Why it could be viral: + +"Paste this Memory Capsule into any LLM and it works like it knows the project" is a simple hook. + +### 11. Memory Economics + +Mem0 is pushing token efficiency as a production concern. Audrey should make memory economics visible: + +- Tokens avoided. +- Repeated user explanations avoided. +- Failures prevented. +- Time saved by not rediscovering setup. +- Cost difference between full-context replay and selective recall. + +Feature candidate: + +- `npx audrey roi` +- `memory_status` with saved-token estimates. +- Tool-trace reports showing prevented repeat failures. + +Why it matters: + +Business buyers need a reason to pay. "Audrey saved 40 minutes and avoided three failed deploys this week" is a reason. + +### 12. Sleep That Produces New Working Knowledge + +Many systems summarize. Humans consolidate. Audrey's "dream" concept is stronger if it becomes visibly useful: + +- Detect repeated failures. +- Turn patterns into procedures. +- Find contradictions. +- Promote stable lessons into rules. +- Archive low-value memories. +- Produce a morning briefing. + +Feature candidate: + +- `npx audrey dream --report` +- `npx audrey promote --target codex-rules` +- "Last night Audrey learned..." + +Why it matters: + +The public understands "AI that dreams on your work and wakes up smarter." The engineering version must stay honest: it is consolidation, contradiction detection, procedural learning, and decay. + +## Audrey's Best Feature Bet + +The single best over-the-top feature now shipping is: + +> Memory Reflexes: before an agent acts, Audrey checks prior memories, tool traces, environment quirks, and project rules, then returns trigger-response guidance the host can automate. + +Why this is the right bet: + +- It uses Audrey's existing differentiators: tool traces, procedural memory, Memory Capsules, confidence, tags, and local host identity. +- It is easy to demonstrate. +- It works across Codex, Claude, and Ollama. +- It solves a real pain: agents repeat mistakes. +- It is more defensible than generic chat memory. + +Demo script: + +1. Run a command that fails on this Windows host because of a known `spawn EPERM`, temp-dir, or config-path issue. +2. Encode the failure through Audrey's tool trace path. +3. Start a new agent session. +4. Ask the agent to run the risky workflow again. +5. Audrey returns a reflex: "Before using npm test, review the prior EPERM failure path." +6. The agent avoids the repeated failure or switches to the known fallback validation path. + +Tagline: + +> Audrey gives AI agents memory before they act. + +## Launch-Ready Content Angles + +### Post 1: "Your AI Has Amnesia" + +Hook: + +Your AI can write code, call tools, browse docs, and deploy software. Then tomorrow it forgets the lesson it learned today. + +Audrey angle: + +Memory should be local, inspectable, portable, and testable. + +### Post 2: "Context Windows Are Not Memory" + +Hook: + +A million-token context window is a bigger backpack. It is not a brain. + +Audrey angle: + +Real memory needs write policy, retrieval policy, forgetting, contradiction handling, source lineage, and regression tests. + +### Post 3: "The Agent Black Box" + +Hook: + +When an AI agent makes a mistake, where does that mistake go? + +Audrey angle: + +Tool traces become procedural memory so agents avoid repeating preventable failures. + +### Post 4: "Bring Your Memory" + +Hook: + +Every AI platform wants to remember you. None of them want your memory to leave. + +Audrey angle: + +Audrey is the local-first memory passport across Codex, Claude, Ollama, and internal agents. + +### Post 5: "The Small Business Brain" + +Hook: + +Every small business has invisible operating knowledge: how quotes are written, which customers need special handling, what breaks on the website, and how the owner likes decisions made. + +Audrey angle: + +Audrey turns that invisible knowledge into a local memory layer for websites, CRMs, support agents, and back-office automation. + +## Near-Term Audrey Roadmap + +### 30 Days + +- Add `npx audrey install --host codex|claude-code|claude-desktop|generic` with dry-run and backups. +- Add `npx audrey capsule "task" --format md|json`. +- Add richer Memory Preflight demos, policy modes, and tool classifiers. +- Add a Memory Capsule file exporter. +- Add docs and demos for "agent avoids repeated failure." + +### 60 Days + +- Add memory diff/checkpoint/rollback commands. +- Add memory eval fixtures and CI gates. +- Add temporal validity fields and supersession UI/API. +- Add first LoCoMo and LongMemEval adapters. +- Add a small-business CRM demo with customer memory, workflow memory, and tool preflight. + +### 90 Days + +- Add permission scopes for shared memory. +- Add feedback learning over recall outcomes. +- Add capsule sharing and signed export bundles. +- Add multimodal asset memory prototype. +- Add dashboard/reporting for ROI, failures prevented, and token budget. + +## References + +- Anthropic, "Bringing memory to Claude" (2025): https://claude.com/blog/memory +- Anthropic, "Built-in memory for Claude Managed Agents" (2026): https://claude.com/blog/claude-managed-agents-memory +- OpenAI Help, "Memory FAQ": https://help.openai.com/en/articles/8590148-memory-faq/ +- Google Gemini Help, "Save info and reference past chats": https://support.google.com/gemini/answer/15637730 +- Letta Docs, "Introduction to Stateful Agents": https://docs.letta.com/guides/core-concepts/stateful-agents +- Mem0, "Memory Evaluation": https://docs.mem0.ai/core-concepts/memory-evaluation +- Chhikara et al., "Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory" (2025): https://arxiv.org/abs/2504.19413 +- Rasmussen et al., "Zep: A Temporal Knowledge Graph Architecture for Agent Memory" (2025): https://arxiv.org/abs/2501.13956 +- Li et al., "MemOS: A Memory OS for AI System" (2025): https://arxiv.org/abs/2507.03724 +- Wang and Chen, "MIRIX: Multi-Agent Memory System for LLM-Based Agents" (2025): https://arxiv.org/abs/2507.07957 +- Wu et al., "LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory" (2025): https://arxiv.org/abs/2410.10813 +- Maharana et al., "Evaluating Very Long-Term Conversational Memory of LLM Agents" (LoCoMo, 2024): https://arxiv.org/abs/2402.17753 +- Hao et al., "User Preference Modeling for Conversational LLM Agents" (2026): https://arxiv.org/abs/2603.20939 +- Zhang et al., "MemoryCD" (2026): https://openreview.net/forum?id=Lpq4aEqvmg +- Rezazadeh et al., "Collaborative Memory" (2026 submission): https://openreview.net/forum?id=pJUQ5YA98Z +- "Mem2ActBench" (2026 submission): https://openreview.net/forum?id=hiRJ90xzJY +- Ollama OpenAI compatibility: https://docs.ollama.com/api/openai-compatibility +- Ollama tool calling: https://docs.ollama.com/capabilities/tool-calling diff --git a/docs/handoffs/audrey-industry-standard-assessment-2026-04-23.md b/docs/handoffs/audrey-industry-standard-assessment-2026-04-23.md new file mode 100644 index 0000000..a71bd59 --- /dev/null +++ b/docs/handoffs/audrey-industry-standard-assessment-2026-04-23.md @@ -0,0 +1,144 @@ +# Audrey Industry Standard Assessment + +Assessment date: 2026-04-23 +Last updated: 2026-04-24 +Branch: `master` +Checkout: `B:\projects\claude\audrey` + +## Product Thesis + +Audrey should not be framed as a Claude Code add-on. Claude Code is one distribution channel. + +The stronger category is: + +**Audrey is the local-first continuity runtime for AI agents.** + +That means Audrey should sit underneath Codex, Claude Code, Claude Desktop, Cursor, Windsurf, VS Code, JetBrains, Ollama-backed local agents, and custom internal agents. The host should be replaceable. Audrey's job is persistent memory, recall, contradiction handling, consolidation, tool-trace learning, and behavior carryover. + +The market is moving this way: + +- Claude now has project-scoped memory with view/edit controls and incognito behavior. +- ChatGPT has saved memories and chat-history memory with user controls. +- Ollama supports local tool calling and OpenAI-compatible local APIs, which means local agents can call Audrey as a tool layer. +- Mem0, Zep/Graphiti, LongMemEval, LoCoMo, and MIRIX all point toward selective, temporal, structured, benchmarked memory rather than "dump all chat history into context." + +Audrey's wedge should be local-first, host-neutral, inspectable memory that turns agent work into reusable behavior before the agent acts. + +The sharper product pivot is: + +**Audrey gives AI agents Memory Reflexes.** + +That means Audrey turns prior failures, rules, host quirks, and procedures into trigger-response guidance such as "Before using npm test, check the last EPERM failure path." This is more commercially legible than generic "LLM memory" because the outcome is that agents stop repeating expensive mistakes. + +## What Changed In This Pass + +- Reframed the README from "Claude Code and AI agents" to "local-first memory runtime for AI agents." +- Added first-class Codex config generation: `npx audrey mcp-config codex`. +- Added generic MCP config generation: `npx audrey mcp-config generic` and host-specific output for VS Code. +- Changed the default MCP agent identity from `claude-code` to `local-agent`; the Claude installer still pins `AUDREY_AGENT=claude-code`. +- Prevented printable MCP configs from emitting provider API keys. +- Added `docs/ollama-local-agents.md` for Ollama/local-agent REST tool-bridge use. +- Added `POST /v1/capsule` so REST sidecar agents can use the same Memory Capsule concept exposed by MCP. +- Added `POST /v1/preflight` so REST sidecar agents can check memory before taking risky actions. +- Added `POST /v1/reflexes` so hosts can receive trigger-response Memory Reflexes derived from preflight evidence. +- Added SDK methods `audrey.preflight(action, options)` and `audrey.reflexes(action, options)`. +- Added MCP tools `memory_preflight` and `memory_reflexes`, bringing the host-facing MCP surface to 19 memory tools. +- Added `npx audrey demo`, a 60-second local proof path that writes temporary memories, records a redacted tool failure, asks for a Memory Capsule, proves recall, and cleans up without requiring API keys or host setup. +- Upgraded `npx audrey demo` so it also prints a Memory Reflex proof from a remembered failed tool trace. +- Added `examples/ollama-memory-agent.js`, a complete Ollama `/api/chat` tool-loop example that uses Audrey's `/v1/reflexes`, `/v1/preflight`, `/v1/capsule`, `/v1/recall`, and `/v1/encode` routes. +- Updated package files so the npm tarball includes the MCP host guide and Ollama guide. +- Removed the accidental self-dependency on `audrey` from package metadata. +- Ignored local `.tmp-npm-cache/` and `.claude/settings.local.json` noise. + +## Current Proof Signals + +These commands passed on this machine: + +- `npm run build` +- `npm run typecheck` +- `npm run bench:memory:check` +- `npm pack --dry-run --cache .\.tmp-npm-cache` +- Direct `mcp-config` smoke for Codex and generic MCP output +- Direct `npx audrey demo` equivalent smoke through `node dist\mcp-server\index.js demo` +- Direct `examples/ollama-memory-agent.js --help` syntax/UX smoke +- Direct HTTP capsule smoke against built `dist/` +- Direct SDK reflex smoke: one remembered `npm test` failure produced `decision=caution`, one reflex, trigger `Before using npm test`, and `response_type=warn`. +- Direct HTTP reflex smoke against `POST /v1/reflexes` with bearer auth returned `status=200`, `decision=caution`, one reflex, and embedded preflight when requested. +- Direct MCP schema smoke confirmed `memory_reflexes` rejects empty actions and accepts `include_preflight`. +- `node dist\mcp-server\index.js status --json --fail-on-unhealthy` +- `python -m unittest discover -s python/tests -v` +- `npm view audrey version --cache .\.tmp-npm-cache` returned `0.20.0` + +Local memory health is green: + +- `healthy=true` +- `episodes=58` +- `vec_episodes=58` +- `schema_version=12` +- `reembed_recommended=false` + +Known local test limitation: + +- `npx vitest run tests/mcp-server.test.js` still fails at startup with `spawn EPERM` from Vite/esbuild in this environment. Treat that as a host execution blocker, not proof of a code regression. CI still needs to be checked separately. + +## Strengths Worth Defending + +- SQLite plus `sqlite-vec` keeps Audrey local-first and easy to ship. +- Memory is richer than RAG: episodic, semantic, procedural, affect, confidence decay, contradictions, causal links, consolidation, forgetting, and tool traces. +- MCP and REST now both expose the critical path for agent hosts. +- The Memory Capsule is the right retrieval product shape: structured, ranked, evidence-backed, and budgeted. +- Memory Reflexes are the clearest product wedge: they repackage evidence as trigger-response behavior agents can automate. +- Tool-trace memory is a differentiated idea: Audrey remembers the work, not just the chat. +- Benchmark instincts are already present, and the local regression gate is green. + +## Release Blockers + +1. Python SDK and TS HTTP server contract drift. + Python integration tests are skipped because `/v1/analytics`, `/v1/mark-used`, and snapshot/restore body contracts do not fully match the server. Fix this before calling Python first-class. + +2. OpenAPI/docs surface is not current in the active `src/routes.ts`. + Older plans mention `/openapi.json` and `/docs`, but the current active server file is plain Hono routes. Either restore OpenAPI for `/v1/capsule`, `/v1/preflight`, and `/v1/reflexes`, or remove the claim everywhere. + +3. Remote MCP is still missing. + ChatGPT-style remote MCP needs a streaming HTTP/SSE deployment story. Local stdio MCP covers Codex/Claude/Desktop IDEs, not ChatGPT remote connectors. + +4. External benchmark credibility is still incomplete. + The internal benchmark is useful as a regression gate, but Audrey needs reproducible LoCoMo and LongMemEval adapters to compete credibly. + +5. Host installers are uneven. + Claude Code has `npx audrey install`; Codex has generated TOML; Claude Desktop has docs; Ollama has REST bridge docs. The next product slice should make this feel like one coherent install story. + +6. Some tests are intentionally skipped. + `multi-agent`, `implicit relevance feedback`, one recall failure test, and one wait-for-idle test are skipped. These are not all release blockers, but they mark unfinished product claims. + +## Highest-Leverage Next Slices + +1. Build the unified host installer. + Add `npx audrey install --host codex|claude-code|claude-desktop|generic` with dry-run support and safe config backup. Keep `mcp-config` as the non-mutating path. + +2. Wire Memory Reflexes into real host hooks. + Codex, Claude Code, and local agents should be able to call `memory_reflexes` automatically before shell commands, file edits, deploys, package publishing, and CRM/customer actions. + +3. Repair Python SDK parity. + Either implement the missing TS HTTP routes or remove unsupported Python methods. Unskip the integration tests only when the contract is real. + +4. Restore official API docs. + Reintroduce `/openapi.json` and `/docs` for the current route set, including `/v1/capsule`, `/v1/preflight`, and `/v1/reflexes`, or stop marketing that surface. + +5. Add an Ollama example agent test. + Initial example exists at `examples/ollama-memory-agent.js`. Next step is a CI-safe mocked Ollama test plus a real local smoke when Ollama is installed. + +6. Build external benchmark adapters. + Start with a small LoCoMo harness, then LongMemEval. Keep the local benchmark labeled as regression-only. + +## Strategic Positioning + +Audrey should sell three outcomes: + +- Agents stop forgetting operational context across tools and hosts. +- Teams can inspect, export, repair, and govern memory locally. +- Memory becomes behavior: repeated failures become reflexes, warnings, procedures, rules, and project-specific habits. + +The small-business angle fits this: websites, CRMs, support bots, ops assistants, and local AI automations all need durable memory without giving every customer workflow to a hosted memory vendor. + +The category is not "Claude remembers." The category is "every agent you run gets a durable local brain and checks it before acting." diff --git a/docs/mcp-hosts.md b/docs/mcp-hosts.md index c3f1f53..1cdfbb1 100644 --- a/docs/mcp-hosts.md +++ b/docs/mcp-hosts.md @@ -1,6 +1,16 @@ # Audrey MCP Host Guide -Audrey ships as a local stdio MCP server, so the simplest cross-host setup is to launch it with `npx`. +Audrey ships as a local stdio MCP server. Claude Code is only one host; the same server is meant to be used from Codex, Claude Desktop, Cursor, Windsurf, VS Code, JetBrains, and any MCP-compatible local agent shell. + +For pinned configs that launch the built Audrey entrypoint directly: + +```bash +npx audrey mcp-config codex +npx audrey mcp-config generic +npx audrey mcp-config vscode +``` + +For portable configs that always resolve the latest published package, launch with `npx`: ```json { @@ -29,6 +39,62 @@ If a Windows host fails to locate `npx`, use: } ``` +## Codex + +Codex uses TOML under `C:\Users\\.codex\config.toml` on Windows. + +Generate a pinned block: + +```bash +npx audrey mcp-config codex +``` + +Example shape: + +```toml +[mcp_servers.audrey-memory] +command = "C:\\Program Files\\nodejs\\node.exe" +args = ["C:\\Users\\you\\AppData\\Roaming\\npm\\node_modules\\audrey\\dist\\mcp-server\\index.js"] + +[mcp_servers.audrey-memory.env] +AUDREY_AGENT = "codex" +AUDREY_DATA_DIR = "C:\\Users\\you\\.audrey\\data" +AUDREY_EMBEDDING_PROVIDER = "local" +AUDREY_DEVICE = "gpu" +``` + +Use one shared `AUDREY_DATA_DIR` if Codex and other hosts should remember the same work. Use separate data directories if you need hard separation between clients or projects. + +## Claude Code + +Claude Code can use Audrey through the built-in installer: + +```bash +npx audrey install +claude mcp list +``` + +The installer persists a Claude Code `AUDREY_AGENT=claude-code` identity while still using the same Audrey MCP runtime as every other host. + +## Claude Desktop + +Claude Desktop uses `claude_desktop_config.json`. + +```json +{ + "mcpServers": { + "audrey-memory": { + "type": "stdio", + "command": "npx", + "args": ["-y", "audrey"], + "env": { + "AUDREY_AGENT": "claude-desktop" + } + } + } +} +``` + ## Cursor Official docs: @@ -128,6 +194,6 @@ Example JSON: Once connected, hosts can use: -- Tools: the 13 `memory_*` Audrey tools +- Tools: the 19 `memory_*` Audrey tools, including `memory_preflight` and `memory_reflexes` - Resources: `audrey://status`, `audrey://recent`, `audrey://principles` - Prompts: `audrey-session-briefing`, `audrey-memory-recall`, `audrey-memory-reflection` diff --git a/docs/ollama-local-agents.md b/docs/ollama-local-agents.md new file mode 100644 index 0000000..642390d --- /dev/null +++ b/docs/ollama-local-agents.md @@ -0,0 +1,128 @@ +# Audrey With Ollama Local Agents + +Ollama provides local model inference. Audrey provides long-term memory. Treat Audrey as the memory sidecar that your Ollama-backed agent calls through tools. + +This is intentionally host-neutral: the same Audrey data directory can be shared by Codex, Claude Code, Claude Desktop, and a local Ollama agent, or isolated per project. + +## Start Audrey + +```bash +AUDREY_AGENT=ollama-local-agent AUDREY_EMBEDDING_PROVIDER=local npx audrey serve +``` + +Health check: + +```bash +curl http://localhost:7437/health +curl http://localhost:7437/v1/status +``` + +Use `AUDREY_API_KEY` if the sidecar is reachable beyond your local process boundary: + +```bash +AUDREY_API_KEY=secret AUDREY_AGENT=ollama-local-agent npx audrey serve +``` + +## Memory Tools To Expose + +Expose these Audrey routes as function tools in your local agent loop: + +| Tool | Audrey route | Purpose | +|---|---|---| +| `memory_preflight` | `POST /v1/preflight` | Check known risks, rules, procedures, and prior failures before tool use | +| `memory_reflexes` | `POST /v1/reflexes` | Convert preflight evidence into trigger-response rules the agent can automate | +| `memory_capsule` | `POST /v1/capsule` | Build a compact, ranked context packet for the current task | +| `memory_recall` | `POST /v1/recall` | Search durable memories | +| `memory_encode` | `POST /v1/encode` | Store useful observations, decisions, procedures, and preferences | +| `memory_status` | `GET /v1/status` | Check memory/index health | + +Minimum useful loop: + +1. Before tool use, call `memory_reflexes` or `memory_preflight` for the proposed action. +2. If a reflex says `block`, stop and ask for repair or approval. +3. Before calling Ollama, ask Audrey for a capsule using the user task as the query. +4. Add the capsule to the model instructions or context. +5. Let the model call `memory_recall` for details when needed. +6. After the task, call `memory_encode` for durable facts, decisions, mistakes, procedures, and preferences. +7. Run `npx audrey dream` on a schedule to consolidate and decay memory. + +## Native Ollama Tool Shape + +Ollama supports function tools on `/api/chat`. Your agent owns the loop that executes a tool call and sends the result back to the model. + +Audrey ships a complete example loop: + +```bash +OLLAMA_MODEL=qwen3 node examples/ollama-memory-agent.js "What should you remember about this project?" +``` + +```json +{ + "type": "function", + "function": { + "name": "memory_recall", + "description": "Recall Audrey memories relevant to a query.", + "parameters": { + "type": "object", + "required": ["query"], + "properties": { + "query": { + "type": "string", + "description": "Search query for durable memory." + }, + "limit": { + "type": "number", + "description": "Maximum results to return." + } + } + } + } +} +``` + +Tool executor: + +```js +export async function memoryRecall({ query, limit = 5 }) { + const response = await fetch('http://localhost:7437/v1/recall', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ query, limit }), + }); + if (!response.ok) { + throw new Error(`Audrey recall failed: ${response.status}`); + } + return response.json(); +} +``` + +## OpenAI-Compatible Ollama Mode + +Ollama also exposes an OpenAI-compatible API at `http://localhost:11434/v1/`. If your local agent framework already knows how to call OpenAI-style tools, point the model client at Ollama and keep Audrey as the tool executor. + +The important separation is: + +- Ollama answers with local models. +- Audrey remembers, recalls, reconciles, and consolidates. +- The agent loop decides when a model tool call should hit Audrey. + +Official Ollama references: + +- Native tool calling: +- OpenAI-compatible API: + +## Data Layout + +For shared memory across hosts: + +```bash +AUDREY_DATA_DIR=$HOME/.audrey/data +``` + +For project-local memory: + +```bash +AUDREY_DATA_DIR=.audrey-data +``` + +Shared memory is better for personal continuity across Codex, Claude, and local agents. Project-local memory is better when clients, repositories, or experiments must not bleed into each other. diff --git a/docs/production-readiness.md b/docs/production-readiness.md index 5f9e9b7..47c18ce 100644 --- a/docs/production-readiness.md +++ b/docs/production-readiness.md @@ -2,7 +2,7 @@ Audrey is ready to be the memory layer inside a production agent system, but it is not a complete regulated-platform package by itself. Treat it as stateful infrastructure: pin providers, isolate tenants, monitor health, and wrap it with the controls your environment requires. -First contact should now go through `npx audrey init sidecar-prod` for the sidecar path or `npx audrey init` for the default Claude Code path, then `npx audrey doctor` before exposing Audrey to real traffic. +First contact should now go through `npx audrey mcp-config ` for local MCP hosts, `npx audrey install` for Claude Code specifically, or `npx audrey serve` for the sidecar path. Run `npx audrey status --json --fail-on-unhealthy` before exposing Audrey to real traffic. ## Best Vertical Fit @@ -102,7 +102,6 @@ That keeps Audrey focused on memory integrity while the host system owns complia Audrey now ships with a first-party container path for the REST API: ```bash -npx audrey init sidecar-prod docker compose up -d --build ``` diff --git a/examples/ollama-memory-agent.js b/examples/ollama-memory-agent.js new file mode 100644 index 0000000..d412d20 --- /dev/null +++ b/examples/ollama-memory-agent.js @@ -0,0 +1,326 @@ +#!/usr/bin/env node + +const AUDREY_URL = (process.env.AUDREY_URL || 'http://127.0.0.1:7437').replace(/\/$/, ''); +const OLLAMA_URL = (process.env.OLLAMA_URL || 'http://127.0.0.1:11434').replace(/\/$/, ''); +const OLLAMA_MODEL = process.env.OLLAMA_MODEL || 'qwen3'; +const AUDREY_API_KEY = process.env.AUDREY_API_KEY || ''; +const MAX_TOOL_LOOPS = Number.parseInt(process.env.MAX_TOOL_LOOPS || '4', 10); + +const userPrompt = process.argv.slice(2).join(' ').trim() + || 'Use Audrey memory to explain how this local Ollama agent should remember useful facts.'; + +function usage() { + console.log(` +Audrey + Ollama local memory agent + +Prerequisites: + 1. Start Audrey: AUDREY_AGENT=ollama-local-agent npx audrey serve + 2. Start Ollama and pull a tool-capable model: ollama pull qwen3 + +Run: + OLLAMA_MODEL=qwen3 node examples/ollama-memory-agent.js "What should you remember about this project?" + +Environment: + AUDREY_URL=http://127.0.0.1:7437 + AUDREY_API_KEY=secret + OLLAMA_URL=http://127.0.0.1:11434 + OLLAMA_MODEL=qwen3 +`); +} + +function headers() { + const h = { 'Content-Type': 'application/json' }; + if (AUDREY_API_KEY) h.Authorization = `Bearer ${AUDREY_API_KEY}`; + return h; +} + +async function jsonFetch(url, options = {}) { + const response = await fetch(url, options); + const text = await response.text(); + let data = null; + if (text.trim()) { + try { + data = JSON.parse(text); + } catch { + data = { raw: text }; + } + } + if (!response.ok) { + const detail = data?.error || data?.message || text || response.statusText; + throw new Error(`${response.status} ${response.statusText}: ${detail}`); + } + return data; +} + +async function audreyGet(path) { + return jsonFetch(`${AUDREY_URL}${path}`, { headers: headers() }); +} + +async function audreyPost(path, body) { + return jsonFetch(`${AUDREY_URL}${path}`, { + method: 'POST', + headers: headers(), + body: JSON.stringify(body), + }); +} + +async function memoryRecall({ query, limit = 5 }) { + if (!query || typeof query !== 'string') { + throw new Error('memory_recall requires a string query'); + } + return audreyPost('/v1/recall', { query, limit }); +} + +async function memoryCapsule({ query, budget_chars = 4000 }) { + if (!query || typeof query !== 'string') { + throw new Error('memory_capsule requires a string query'); + } + return audreyPost('/v1/capsule', { query, budget_chars }); +} + +async function memoryPreflight({ action, tool, strict = false, include_capsule = false }) { + if (!action || typeof action !== 'string') { + throw new Error('memory_preflight requires a string action'); + } + return audreyPost('/v1/preflight', { action, tool, strict, include_capsule }); +} + +async function memoryReflexes({ action, tool, strict = false, include_preflight = false }) { + if (!action || typeof action !== 'string') { + throw new Error('memory_reflexes requires a string action'); + } + return audreyPost('/v1/reflexes', { action, tool, strict, include_preflight }); +} + +async function memoryEncode({ content, source = 'model-generated', tags = ['ollama-agent'] }) { + if (!content || typeof content !== 'string') { + throw new Error('memory_encode requires string content'); + } + return audreyPost('/v1/encode', { content, source, tags }); +} + +const toolExecutors = { + memory_preflight: memoryPreflight, + memory_reflexes: memoryReflexes, + memory_recall: memoryRecall, + memory_capsule: memoryCapsule, + memory_encode: memoryEncode, +}; + +const tools = [ + { + type: 'function', + function: { + name: 'memory_preflight', + description: 'Check Audrey memory before taking an action, so prior failures and rules are not repeated.', + parameters: { + type: 'object', + required: ['action'], + properties: { + action: { type: 'string', description: 'Action the agent is considering.' }, + tool: { type: 'string', description: 'Optional tool or command family.' }, + strict: { type: 'boolean', description: 'If true, high-severity warnings can block the action.' }, + include_capsule: { type: 'boolean', description: 'Include full capsule context in the result.' }, + }, + }, + }, + }, + { + type: 'function', + function: { + name: 'memory_reflexes', + description: 'Return Audrey Memory Reflexes: trigger-response rules for the action the agent is considering.', + parameters: { + type: 'object', + required: ['action'], + properties: { + action: { type: 'string', description: 'Action the agent is considering.' }, + tool: { type: 'string', description: 'Optional tool or command family.' }, + strict: { type: 'boolean', description: 'If true, high-severity warnings can become blocking reflexes.' }, + include_preflight: { type: 'boolean', description: 'Include the full underlying preflight report.' }, + }, + }, + }, + }, + { + type: 'function', + function: { + name: 'memory_recall', + description: 'Recall durable Audrey memories relevant to a query.', + parameters: { + type: 'object', + required: ['query'], + properties: { + query: { type: 'string', description: 'Search query for Audrey memory.' }, + limit: { type: 'number', description: 'Maximum memories to return.' }, + }, + }, + }, + }, + { + type: 'function', + function: { + name: 'memory_capsule', + description: 'Build a compact, evidence-backed Audrey Memory Capsule for the current task.', + parameters: { + type: 'object', + required: ['query'], + properties: { + query: { type: 'string', description: 'Current task or question.' }, + budget_chars: { type: 'number', description: 'Maximum capsule size in characters.' }, + }, + }, + }, + }, + { + type: 'function', + function: { + name: 'memory_encode', + description: 'Store a useful lasting observation, decision, preference, or procedure in Audrey.', + parameters: { + type: 'object', + required: ['content'], + properties: { + content: { type: 'string', description: 'Memory content to store.' }, + source: { + type: 'string', + enum: ['direct-observation', 'told-by-user', 'tool-result', 'inference', 'model-generated'], + description: 'Source reliability category.', + }, + tags: { + type: 'array', + items: { type: 'string' }, + description: 'Searchable tags for this memory.', + }, + }, + }, + }, + }, +]; + +function parseToolArguments(args) { + if (args == null) return {}; + if (typeof args === 'string') { + try { + return JSON.parse(args); + } catch { + return { raw: args }; + } + } + return args; +} + +async function ollamaChat(messages) { + return jsonFetch(`${OLLAMA_URL}/api/chat`, { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ + model: OLLAMA_MODEL, + stream: false, + messages, + tools, + }), + }); +} + +async function main() { + if (process.argv.includes('--help') || process.argv.includes('-h')) { + usage(); + return; + } + + try { + await audreyGet('/health'); + } catch (err) { + console.error(`Audrey is not reachable at ${AUDREY_URL}.`); + console.error('Start it with: AUDREY_AGENT=ollama-local-agent npx audrey serve'); + console.error(`Details: ${err.message}`); + process.exit(1); + } + + const reflexes = await memoryReflexes({ action: userPrompt, include_preflight: false }); + const preflight = await memoryPreflight({ action: userPrompt, include_capsule: false }); + const capsule = await memoryCapsule({ query: userPrompt, budget_chars: 4000 }); + const messages = [ + { + role: 'system', + content: [ + 'You are a local Ollama agent with Audrey long-term memory.', + 'Use Audrey tools when memory would improve the answer.', + 'Before taking risky tool actions, call memory_reflexes or memory_preflight and follow any warnings.', + 'Store only durable preferences, facts, decisions, procedures, and useful lessons.', + '', + 'Initial Audrey Memory Reflexes:', + JSON.stringify(reflexes, null, 2).slice(0, 3000), + '', + 'Initial Audrey Preflight:', + JSON.stringify(preflight, null, 2).slice(0, 3000), + '', + 'Initial Audrey Memory Capsule:', + JSON.stringify(capsule, null, 2).slice(0, 6000), + ].join('\n'), + }, + { role: 'user', content: userPrompt }, + ]; + + console.error(`[audrey-ollama] model=${OLLAMA_MODEL} audrey=${AUDREY_URL} ollama=${OLLAMA_URL}`); + + for (let i = 0; i < MAX_TOOL_LOOPS; i += 1) { + let response; + try { + response = await ollamaChat(messages); + } catch (err) { + console.error(`Ollama is not reachable at ${OLLAMA_URL}, or model "${OLLAMA_MODEL}" is not available.`); + console.error(`Try: ollama pull ${OLLAMA_MODEL}`); + console.error(`Details: ${err.message}`); + process.exit(1); + } + + const message = response.message || {}; + messages.push(message); + + const calls = message.tool_calls || []; + if (calls.length === 0) { + console.log(message.content || '(model returned no content)'); + await memoryEncode({ + content: `Ollama agent answered: ${userPrompt.slice(0, 240)}`, + source: 'model-generated', + tags: ['ollama-agent', 'session-summary'], + }).catch(() => undefined); + return; + } + + for (const call of calls) { + const name = call.function?.name; + const executor = toolExecutors[name]; + if (!executor) { + messages.push({ role: 'tool', tool_name: name || 'unknown', content: 'Unknown Audrey tool' }); + continue; + } + + const args = parseToolArguments(call.function?.arguments); + console.error(`[audrey-ollama] tool ${name} ${JSON.stringify(args)}`); + try { + const result = await executor(args); + messages.push({ + role: 'tool', + tool_name: name, + content: JSON.stringify(result).slice(0, 8000), + }); + } catch (err) { + messages.push({ + role: 'tool', + tool_name: name, + content: `Audrey tool error: ${err.message}`, + }); + } + } + } + + console.log('Stopped after MAX_TOOL_LOOPS without a final model answer.'); +} + +main().catch((err) => { + console.error(err); + process.exit(1); +}); diff --git a/mcp-server/config.ts b/mcp-server/config.ts index 85e3da6..1652b6e 100644 --- a/mcp-server/config.ts +++ b/mcp-server/config.ts @@ -1,144 +1,248 @@ -import { homedir } from 'node:os'; -import { join } from 'node:path'; -import { fileURLToPath } from 'node:url'; -import type { AudreyConfig, EmbeddingConfig, LLMConfig } from '../src/types.js'; - +import { homedir } from 'node:os'; +import { join } from 'node:path'; +import { fileURLToPath } from 'node:url'; +import type { AudreyConfig, EmbeddingConfig, LLMConfig } from '../src/types.js'; + export const VERSION = '0.20.0'; export const SERVER_NAME = 'audrey-memory'; +export const DEFAULT_AGENT = 'local-agent'; export const DEFAULT_DATA_DIR = join(homedir(), '.audrey', 'data'); export const MCP_ENTRYPOINT = fileURLToPath(new URL('./index.js', import.meta.url)); -const VALID_EMBEDDING_PROVIDERS = new Set(['mock', 'local', 'gemini', 'openai']); -const VALID_LLM_PROVIDERS = new Set(['mock', 'anthropic', 'openai']); - -function assertValidProvider(provider: string, validProviders: Set, envVar: string): void { - if (!validProviders.has(provider)) { - throw new Error(`Unsupported ${envVar} value: ${provider}`); - } -} - -function defaultEmbeddingDimensions(provider: string): number { - switch (provider) { - case 'mock': - return 64; - case 'openai': - return 1536; - case 'gemini': - return 3072; - case 'local': - default: - return 384; - } -} - -export function resolveDataDir(env: Record = process.env): string { - return env['AUDREY_DATA_DIR'] || DEFAULT_DATA_DIR; -} - -/** - * Resolves which embedding provider to use. - * Priority: explicit config -> gemini (if GOOGLE_API_KEY exists) -> local - * OpenAI is NEVER auto-selected -- must be set explicitly via AUDREY_EMBEDDING_PROVIDER=openai. - */ -export function resolveEmbeddingProvider( - env: Record, - explicit: string | undefined = env['AUDREY_EMBEDDING_PROVIDER'], -): EmbeddingConfig & { dimensions: number } { - if (explicit && explicit !== 'auto') { - assertValidProvider(explicit, VALID_EMBEDDING_PROVIDERS, 'AUDREY_EMBEDDING_PROVIDER'); - const provider = explicit as EmbeddingConfig['provider']; - const dims = defaultEmbeddingDimensions(explicit); - const apiKey = explicit === 'gemini' - ? (env['GOOGLE_API_KEY'] || env['GEMINI_API_KEY']) - : explicit === 'openai' - ? env['OPENAI_API_KEY'] - : undefined; - const result: EmbeddingConfig & { dimensions: number } = { provider, apiKey, dimensions: dims }; - if (explicit === 'local') result.device = env['AUDREY_DEVICE'] || 'gpu'; - return result; - } - if (env['GOOGLE_API_KEY'] || env['GEMINI_API_KEY']) { - return { provider: 'gemini', apiKey: env['GOOGLE_API_KEY'] || env['GEMINI_API_KEY'], dimensions: 3072 }; - } - return { provider: 'local', dimensions: 384, device: env['AUDREY_DEVICE'] || 'gpu' }; -} - -export function resolveLLMProvider( - env: Record, - explicit: string | undefined = env['AUDREY_LLM_PROVIDER'], -): (LLMConfig & { apiKey?: string }) | null { - if (explicit && explicit !== 'auto') { - assertValidProvider(explicit, VALID_LLM_PROVIDERS, 'AUDREY_LLM_PROVIDER'); - const provider = explicit as LLMConfig['provider']; - if (provider === 'anthropic') { - return { provider: 'anthropic', apiKey: env['ANTHROPIC_API_KEY'] }; - } - if (provider === 'openai') { - return { provider: 'openai', apiKey: env['OPENAI_API_KEY'] }; - } - return { provider: 'mock' }; - } - - if (env['ANTHROPIC_API_KEY']) { - return { provider: 'anthropic', apiKey: env['ANTHROPIC_API_KEY'] }; - } - if (env['OPENAI_API_KEY']) { - return { provider: 'openai', apiKey: env['OPENAI_API_KEY'] }; - } - return null; +export const HOST_AGENT_NAMES = { + generic: DEFAULT_AGENT, + codex: 'codex', + 'claude-code': 'claude-code', + 'claude-desktop': 'claude-desktop', + cursor: 'cursor', + windsurf: 'windsurf', + vscode: 'vscode-copilot', + jetbrains: 'jetbrains', +} as const; + +export type AudreyHost = keyof typeof HOST_AGENT_NAMES; + +interface McpEnvOptions { + includeSecrets?: boolean; } - + +const VALID_EMBEDDING_PROVIDERS = new Set(['mock', 'local', 'gemini', 'openai']); +const VALID_LLM_PROVIDERS = new Set(['mock', 'anthropic', 'openai']); + +function assertValidProvider(provider: string, validProviders: Set, envVar: string): void { + if (!validProviders.has(provider)) { + throw new Error(`Unsupported ${envVar} value: ${provider}`); + } +} + +function defaultEmbeddingDimensions(provider: string): number { + switch (provider) { + case 'mock': + return 64; + case 'openai': + return 1536; + case 'gemini': + return 3072; + case 'local': + default: + return 384; + } +} + +export function resolveDataDir(env: Record = process.env): string { + return env['AUDREY_DATA_DIR'] || DEFAULT_DATA_DIR; +} + +/** + * Resolves which embedding provider to use. + * Priority: explicit config -> gemini (if GOOGLE_API_KEY exists) -> local + * OpenAI is NEVER auto-selected -- must be set explicitly via AUDREY_EMBEDDING_PROVIDER=openai. + */ +export function resolveEmbeddingProvider( + env: Record, + explicit: string | undefined = env['AUDREY_EMBEDDING_PROVIDER'], +): EmbeddingConfig & { dimensions: number } { + if (explicit && explicit !== 'auto') { + assertValidProvider(explicit, VALID_EMBEDDING_PROVIDERS, 'AUDREY_EMBEDDING_PROVIDER'); + const provider = explicit as EmbeddingConfig['provider']; + const dims = defaultEmbeddingDimensions(explicit); + const apiKey = explicit === 'gemini' + ? (env['GOOGLE_API_KEY'] || env['GEMINI_API_KEY']) + : explicit === 'openai' + ? env['OPENAI_API_KEY'] + : undefined; + const result: EmbeddingConfig & { dimensions: number } = { provider, apiKey, dimensions: dims }; + if (explicit === 'local') result.device = env['AUDREY_DEVICE'] || 'gpu'; + return result; + } + if (env['GOOGLE_API_KEY'] || env['GEMINI_API_KEY']) { + return { provider: 'gemini', apiKey: env['GOOGLE_API_KEY'] || env['GEMINI_API_KEY'], dimensions: 3072 }; + } + return { provider: 'local', dimensions: 384, device: env['AUDREY_DEVICE'] || 'gpu' }; +} + +export function resolveLLMProvider( + env: Record, + explicit: string | undefined = env['AUDREY_LLM_PROVIDER'], +): (LLMConfig & { apiKey?: string }) | null { + if (explicit && explicit !== 'auto') { + assertValidProvider(explicit, VALID_LLM_PROVIDERS, 'AUDREY_LLM_PROVIDER'); + const provider = explicit as LLMConfig['provider']; + if (provider === 'anthropic') { + return { provider: 'anthropic', apiKey: env['ANTHROPIC_API_KEY'] }; + } + if (provider === 'openai') { + return { provider: 'openai', apiKey: env['OPENAI_API_KEY'] }; + } + return { provider: 'mock' }; + } + + if (env['ANTHROPIC_API_KEY']) { + return { provider: 'anthropic', apiKey: env['ANTHROPIC_API_KEY'] }; + } + if (env['OPENAI_API_KEY']) { + return { provider: 'openai', apiKey: env['OPENAI_API_KEY'] }; + } + return null; +} + export function buildAudreyConfig(): AudreyConfig { const dataDir = resolveDataDir(process.env); - const agent = process.env['AUDREY_AGENT'] || 'claude-code'; + const agent = process.env['AUDREY_AGENT'] || DEFAULT_AGENT; const explicitProvider = process.env['AUDREY_EMBEDDING_PROVIDER']; - - const embedding = resolveEmbeddingProvider(process.env, explicitProvider); - const llm = resolveLLMProvider(process.env, process.env['AUDREY_LLM_PROVIDER']); - - const config: AudreyConfig = { dataDir, agent, embedding }; - if (llm) { - // LLMConfig requires provider as literal union; resolveLLMProvider guarantees this - config.llm = llm as AudreyConfig['llm']; - } - + + const embedding = resolveEmbeddingProvider(process.env, explicitProvider); + const llm = resolveLLMProvider(process.env, process.env['AUDREY_LLM_PROVIDER']); + + const config: AudreyConfig = { dataDir, agent, embedding }; + if (llm) { + // LLMConfig requires provider as literal union; resolveLLMProvider guarantees this + config.llm = llm as AudreyConfig['llm']; + } + return config; } -export function buildInstallArgs(env: Record = process.env): string[] { +export function resolveHostAgent(host: string | undefined): string { + if (!host) return HOST_AGENT_NAMES.generic; + if (host in HOST_AGENT_NAMES) return HOST_AGENT_NAMES[host as AudreyHost]; + throw new Error(`Unsupported MCP host "${host}". Supported hosts: ${Object.keys(HOST_AGENT_NAMES).join(', ')}`); +} + +export function buildAudreyMcpEnv( + env: Record = process.env, + agent = env['AUDREY_AGENT'] || DEFAULT_AGENT, + options: McpEnvOptions = {}, +): Record { + const includeSecrets = options.includeSecrets ?? true; + const providerEnv = includeSecrets + ? env + : { + ...env, + ANTHROPIC_API_KEY: undefined, + GOOGLE_API_KEY: undefined, + GEMINI_API_KEY: undefined, + OPENAI_API_KEY: undefined, + }; const envPairs = new Map(); const addEnv = (key: string, value: string | undefined | null): void => { if (value === undefined || value === null || value === '') return; - envPairs.set(key, `${key}=${value}`); + envPairs.set(key, value); }; addEnv('AUDREY_DATA_DIR', resolveDataDir(env)); + addEnv('AUDREY_AGENT', agent); - const embedding = resolveEmbeddingProvider(env, env['AUDREY_EMBEDDING_PROVIDER']); + const embedding = resolveEmbeddingProvider(providerEnv, env['AUDREY_EMBEDDING_PROVIDER']); addEnv('AUDREY_EMBEDDING_PROVIDER', embedding.provider); if (embedding.provider === 'local') { addEnv('AUDREY_DEVICE', embedding.device || env['AUDREY_DEVICE'] || 'gpu'); } else if (embedding.provider === 'gemini') { - addEnv('GOOGLE_API_KEY', embedding.apiKey); + if (includeSecrets) addEnv('GOOGLE_API_KEY', embedding.apiKey); } else if (embedding.provider === 'openai') { - addEnv('OPENAI_API_KEY', embedding.apiKey); + if (includeSecrets) addEnv('OPENAI_API_KEY', embedding.apiKey); } - const llm = resolveLLMProvider(env, env['AUDREY_LLM_PROVIDER']); + const llm = resolveLLMProvider(providerEnv, env['AUDREY_LLM_PROVIDER']); if (llm) { addEnv('AUDREY_LLM_PROVIDER', llm.provider); if (llm.provider === 'anthropic') { - addEnv('ANTHROPIC_API_KEY', llm.apiKey); + if (includeSecrets) addEnv('ANTHROPIC_API_KEY', llm.apiKey); } else if (llm.provider === 'openai') { - addEnv('OPENAI_API_KEY', llm.apiKey); + if (includeSecrets) addEnv('OPENAI_API_KEY', llm.apiKey); } } + return Object.fromEntries(envPairs); +} + +export function buildStdioMcpServerConfig( + env: Record = process.env, + host: string | undefined = 'generic', +): { command: string; args: string[]; env: Record } { + const agent = env['AUDREY_AGENT'] || resolveHostAgent(host); + return { + command: process.execPath, + args: [MCP_ENTRYPOINT], + env: buildAudreyMcpEnv(env, agent, { includeSecrets: false }), + }; +} + +function jsonHostConfig(host: string | undefined, env: Record): unknown { + const config = buildStdioMcpServerConfig(env, host); + if (host === 'vscode') { + return { + servers: { + [SERVER_NAME]: { + type: 'stdio', + ...config, + }, + }, + }; + } + + return { + mcpServers: { + [SERVER_NAME]: { + type: 'stdio', + ...config, + }, + }, + }; +} + +function tomlString(value: string): string { + return `"${value.replace(/\\/g, '\\\\').replace(/"/g, '\\"')}"`; +} + +export function formatMcpHostConfig( + host: string | undefined = 'generic', + env: Record = process.env, +): string { + const normalizedHost = host || 'generic'; + if (normalizedHost === 'codex') { + const config = buildStdioMcpServerConfig(env, normalizedHost); + const lines = [ + `[mcp_servers.${SERVER_NAME}]`, + `command = ${tomlString(config.command)}`, + `args = [${config.args.map(tomlString).join(', ')}]`, + '', + `[mcp_servers.${SERVER_NAME}.env]`, + ...Object.entries(config.env).map(([key, value]) => `${key} = ${tomlString(value)}`), + ]; + return lines.join('\n'); + } + + return JSON.stringify(jsonHostConfig(normalizedHost, env), null, 2); +} + +export function buildInstallArgs(env: Record = process.env): string[] { + const envPairs = buildAudreyMcpEnv(env, env['AUDREY_AGENT'] || HOST_AGENT_NAMES['claude-code'], { includeSecrets: true }); const args = ['mcp', 'add', '-s', 'user', SERVER_NAME]; - for (const pair of envPairs.values()) { - args.push('-e', pair); + for (const [key, value] of Object.entries(envPairs)) { + args.push('-e', `${key}=${value}`); } args.push('--', process.execPath, MCP_ENTRYPOINT); - return args; -} + return args; +} diff --git a/mcp-server/index.ts b/mcp-server/index.ts index 3644f2b..5f97edc 100644 --- a/mcp-server/index.ts +++ b/mcp-server/index.ts @@ -1,113 +1,116 @@ -#!/usr/bin/env node +#!/usr/bin/env node import { z } from 'zod'; -import { homedir } from 'node:os'; +import { homedir, tmpdir } from 'node:os'; import { join, resolve } from 'node:path'; -import { existsSync, readFileSync } from 'node:fs'; -import { execFileSync } from 'node:child_process'; -import { fileURLToPath } from 'node:url'; -import { Audrey } from '../src/index.js'; -import { readStoredDimensions } from '../src/db.js'; -import type { AudreyConfig, EmbeddingProvider, IntrospectResult, MemoryStatusResult } from '../src/types.js'; -import { - VERSION, +import { existsSync, mkdirSync, mkdtempSync, readFileSync, rmSync } from 'node:fs'; +import { execFileSync } from 'node:child_process'; +import { fileURLToPath } from 'node:url'; +import { Audrey } from '../src/index.js'; +import { readStoredDimensions } from '../src/db.js'; +import type { AudreyConfig, EmbeddingProvider, IntrospectResult, MemoryStatusResult } from '../src/types.js'; +import { + VERSION, SERVER_NAME, buildAudreyConfig, buildInstallArgs, + formatMcpHostConfig, resolveDataDir, resolveEmbeddingProvider, resolveLLMProvider, } from './config.js'; - -const VALID_SOURCES = { - 'direct-observation': 'direct-observation', - 'told-by-user': 'told-by-user', - 'tool-result': 'tool-result', - 'inference': 'inference', - 'model-generated': 'model-generated', -} as const; - -const VALID_TYPES = { - 'episodic': 'episodic', - 'semantic': 'semantic', - 'procedural': 'procedural', -} as const; - -export const MAX_MEMORY_CONTENT_LENGTH = 50_000; - -const subcommand = process.argv[2]; - -function isNonEmptyText(value: unknown): boolean { - return typeof value === 'string' && value.trim().length > 0; -} - -export function validateMemoryContent(content: string): void { - if (!isNonEmptyText(content)) { - throw new Error('content must be a non-empty string'); - } - if (content.length > MAX_MEMORY_CONTENT_LENGTH) { - throw new Error(`content exceeds maximum length of ${MAX_MEMORY_CONTENT_LENGTH} characters`); - } -} - -export function validateForgetSelection(id?: string, query?: string): void { - if ((id && query) || (!id && !query)) { - throw new Error('Provide exactly one of id or query'); - } -} - -export async function initializeEmbeddingProvider(provider: EmbeddingProvider): Promise { - if (provider && typeof provider.ready === 'function') { - await provider.ready(); - } -} - -export const memoryEncodeToolSchema = { - content: z.string() - .max(MAX_MEMORY_CONTENT_LENGTH) - .refine(isNonEmptyText, 'Content must not be empty') - .describe('The memory content to encode'), - source: z.enum(VALID_SOURCES).describe('Source type of the memory'), - tags: z.array(z.string()).optional().describe('Optional tags for categorization'), - salience: z.number().min(0).max(1).optional().describe('Importance weight 0-1'), - context: z.record(z.string(), z.string()).optional().describe('Situational context as key-value pairs (e.g., {task: "debugging", domain: "payments"})'), - affect: z.object({ - valence: z.number().min(-1).max(1).describe('Emotional valence: -1 (very negative) to 1 (very positive)'), - arousal: z.number().min(0).max(1).optional().describe('Emotional arousal: 0 (calm) to 1 (highly activated)'), - label: z.string().optional().describe('Human-readable emotion label (e.g., "curiosity", "frustration", "relief")'), - }).optional().describe('Emotional affect - how this memory feels'), - private: z.boolean().optional().describe('If true, memory is only visible to the AI and excluded from public recall results'), -}; - -export const memoryRecallToolSchema = { - query: z.string().describe('Search query to match against memories'), - limit: z.number().min(1).max(50).optional().describe('Max results (default 10)'), - types: z.array(z.enum(VALID_TYPES)).optional().describe('Memory types to search'), - min_confidence: z.number().min(0).max(1).optional().describe('Minimum confidence threshold'), - tags: z.array(z.string()).optional().describe('Only return episodic memories with these tags'), - sources: z.array(z.enum(VALID_SOURCES)).optional().describe('Only return episodic memories from these sources'), - after: z.string().optional().describe('Only return memories created after this ISO date'), - before: z.string().optional().describe('Only return memories created before this ISO date'), - context: z.record(z.string(), z.string()).optional().describe('Retrieval context - memories encoded in matching context get boosted'), - mood: z.object({ - valence: z.number().min(-1).max(1).describe('Current emotional valence: -1 (negative) to 1 (positive)'), - arousal: z.number().min(0).max(1).optional().describe('Current arousal: 0 (calm) to 1 (activated)'), - }).optional().describe('Current mood - boosts recall of memories encoded in similar emotional state'), -}; - -export const memoryImportToolSchema = { - snapshot: z.object({ - version: z.string(), - episodes: z.array(z.any()), - semantics: z.array(z.any()).optional(), - procedures: z.array(z.any()).optional(), - causalLinks: z.array(z.any()).optional(), - contradictions: z.array(z.any()).optional(), - consolidationRuns: z.array(z.any()).optional(), - consolidationMetrics: z.array(z.any()).optional(), - config: z.record(z.string(), z.string()).optional(), - }).passthrough().describe('A snapshot from memory_export'), -}; - + +const VALID_SOURCES = { + 'direct-observation': 'direct-observation', + 'told-by-user': 'told-by-user', + 'tool-result': 'tool-result', + 'inference': 'inference', + 'model-generated': 'model-generated', +} as const; + +const VALID_TYPES = { + 'episodic': 'episodic', + 'semantic': 'semantic', + 'procedural': 'procedural', +} as const; + +export const MAX_MEMORY_CONTENT_LENGTH = 50_000; + +const subcommand = process.argv[2]; + +function isNonEmptyText(value: unknown): boolean { + return typeof value === 'string' && value.trim().length > 0; +} + +export function validateMemoryContent(content: string): void { + if (!isNonEmptyText(content)) { + throw new Error('content must be a non-empty string'); + } + if (content.length > MAX_MEMORY_CONTENT_LENGTH) { + throw new Error(`content exceeds maximum length of ${MAX_MEMORY_CONTENT_LENGTH} characters`); + } +} + +export function validateForgetSelection(id?: string, query?: string): void { + if ((id && query) || (!id && !query)) { + throw new Error('Provide exactly one of id or query'); + } +} + +export async function initializeEmbeddingProvider(provider: EmbeddingProvider): Promise { + if (provider && typeof provider.ready === 'function') { + await provider.ready(); + } +} + +export const memoryEncodeToolSchema = { + content: z.string() + .max(MAX_MEMORY_CONTENT_LENGTH) + .refine(isNonEmptyText, 'Content must not be empty') + .describe('The memory content to encode'), + source: z.enum(VALID_SOURCES).describe('Source type of the memory'), + tags: z.array(z.string()).optional().describe('Optional tags for categorization'), + salience: z.number().min(0).max(1).optional().describe('Importance weight 0-1'), + context: z.record(z.string(), z.string()).optional().describe( + 'Situational context as key-value pairs (e.g., {task: "debugging", domain: "payments"})' + ), + affect: z.object({ + valence: z.number().min(-1).max(1).describe('Emotional valence: -1 (very negative) to 1 (very positive)'), + arousal: z.number().min(0).max(1).optional().describe('Emotional arousal: 0 (calm) to 1 (highly activated)'), + label: z.string().optional().describe('Human-readable emotion label (e.g., "curiosity", "frustration", "relief")'), + }).optional().describe('Emotional affect - how this memory feels'), + private: z.boolean().optional().describe('If true, memory is only visible to the AI and excluded from public recall results'), +}; + +export const memoryRecallToolSchema = { + query: z.string().describe('Search query to match against memories'), + limit: z.number().min(1).max(50).optional().describe('Max results (default 10)'), + types: z.array(z.enum(VALID_TYPES)).optional().describe('Memory types to search'), + min_confidence: z.number().min(0).max(1).optional().describe('Minimum confidence threshold'), + tags: z.array(z.string()).optional().describe('Only return episodic memories with these tags'), + sources: z.array(z.enum(VALID_SOURCES)).optional().describe('Only return episodic memories from these sources'), + after: z.string().optional().describe('Only return memories created after this ISO date'), + before: z.string().optional().describe('Only return memories created before this ISO date'), + context: z.record(z.string(), z.string()).optional().describe('Retrieval context - memories encoded in matching context get boosted'), + mood: z.object({ + valence: z.number().min(-1).max(1).describe('Current emotional valence: -1 (negative) to 1 (positive)'), + arousal: z.number().min(0).max(1).optional().describe('Current arousal: 0 (calm) to 1 (activated)'), + }).optional().describe('Current mood - boosts recall of memories encoded in similar emotional state'), +}; + +export const memoryImportToolSchema = { + snapshot: z.object({ + version: z.string(), + episodes: z.array(z.any()), + semantics: z.array(z.any()).optional(), + procedures: z.array(z.any()).optional(), + causalLinks: z.array(z.any()).optional(), + contradictions: z.array(z.any()).optional(), + consolidationRuns: z.array(z.any()).optional(), + consolidationMetrics: z.array(z.any()).optional(), + config: z.record(z.string(), z.string()).optional(), + }).passthrough().describe('A snapshot from memory_export'), +}; + export const memoryForgetToolSchema = { id: z.string().optional().describe('ID of the memory to forget'), query: z.string().optional().describe('Semantic query to find and forget the closest matching memory'), @@ -115,1121 +118,1457 @@ export const memoryForgetToolSchema = { purge: z.boolean().optional().describe('Hard-delete the memory permanently (default false, soft-delete)'), }; -// --------------------------------------------------------------------------- -// Local interface for status reporting -// --------------------------------------------------------------------------- +export const memoryPreflightToolSchema = { + action: z.string() + .refine(isNonEmptyText, 'Action must not be empty') + .describe('Natural-language description of the action the agent is about to take.'), + tool: z.string().optional().describe('Tool or command family about to be used, e.g. Bash, npm test, Edit, deploy.'), + session_id: z.string().optional().describe('Session identifier for grouping the optional preflight event.'), + cwd: z.string().optional().describe('Working directory for the action.'), + files: z.array(z.string()).optional().describe('File paths to fingerprint if record_event is true.'), + strict: z.boolean().optional().describe('If true, high-severity memory warnings produce decision=block instead of caution.'), + limit: z.number().int().min(1).max(50).optional().describe('Max recall results to consider before preflight categorization.'), + budget_chars: z.number().int().min(200).max(32000).optional().describe('Capsule budget in characters.'), + mode: z.enum(['balanced', 'conservative', 'aggressive']).optional().describe('Underlying capsule mode. Defaults to conservative.'), + failure_window_hours: z.number().int().min(1).max(8760).optional().describe( + 'How far back to check failed tool events. Defaults to 168 hours.' + ), + include_status: z.boolean().optional().describe('Include memory health in the response and warning calculation. Defaults to true.'), + record_event: z.boolean().optional().describe('Record a redacted PreToolUse event for this preflight. Defaults to false.'), + include_capsule: z.boolean().optional().describe('If false, omit the embedded Memory Capsule from the response.'), +}; -interface StatusReport { - generatedAt: string; - registered: boolean; - dataDir: string; - exists: boolean; - storedDimensions: number | null; - stats: IntrospectResult | null; - health: MemoryStatusResult | null; - lastConsolidation: string | null; - error: string | null; -} +export const memoryReflexesToolSchema = { + ...memoryPreflightToolSchema, + include_preflight: z.boolean().optional().describe('If true, include the full underlying preflight report.'), +}; // --------------------------------------------------------------------------- -// CLI subcommands +// Local interface for status reporting // --------------------------------------------------------------------------- - -async function serveHttp(): Promise { - const { startServer } = await import('../src/server.js'); - const config = buildAudreyConfig(); - const port = parseInt(process.env.AUDREY_PORT || '7437', 10); - const apiKey = process.env.AUDREY_API_KEY; - - const server = await startServer({ port, config, apiKey }); - console.error(`[audrey-http] v${VERSION} serving on port ${server.port}`); - if (apiKey) { - console.error('[audrey-http] API key authentication enabled'); - } -} - -async function reembed(): Promise { - const dataDir = resolveDataDir(process.env); - const explicit = process.env['AUDREY_EMBEDDING_PROVIDER']; - const embedding = resolveEmbeddingProvider(process.env, explicit); - const storedDims = readStoredDimensions(dataDir); - const dimensionsChanged = storedDims !== null && storedDims !== embedding.dimensions; - - console.log(`Re-embedding with ${embedding.provider} (${embedding.dimensions}d)...`); - if (dimensionsChanged) { - console.log(`Dimension change: ${storedDims}d -> ${embedding.dimensions}d (will drop and recreate vec tables)`); - } - - const audrey = new Audrey({ dataDir, agent: 'reembed', embedding }); - try { - await initializeEmbeddingProvider(audrey.embeddingProvider); - const { reembedAll } = await import('../src/migrate.js'); - const counts = await reembedAll(audrey.db, audrey.embeddingProvider, { dropAndRecreate: dimensionsChanged }); - console.log(`Done. Re-embedded: ${counts.episodes} episodes, ${counts.semantics} semantics, ${counts.procedures} procedures`); - } finally { - audrey.close(); - } -} - -async function dream(): Promise { - const dataDir = resolveDataDir(process.env); - const explicit = process.env['AUDREY_EMBEDDING_PROVIDER']; - const embedding = resolveEmbeddingProvider(process.env, explicit); - const storedDims = readStoredDimensions(dataDir); - - const config: AudreyConfig = { - dataDir, - agent: 'dream', - embedding, - }; - - const llm = resolveLLMProvider(process.env, process.env['AUDREY_LLM_PROVIDER']); - if (llm) config.llm = llm as AudreyConfig['llm']; - - const audrey = new Audrey(config); - try { - await initializeEmbeddingProvider(audrey.embeddingProvider); - - const embeddingLabel = storedDims !== null && storedDims !== embedding.dimensions - ? `${embedding.provider} (${embedding.dimensions}d; stored ${storedDims}d)` - : `${embedding.provider} (${embedding.dimensions}d)`; - - console.log('[audrey] Starting dream cycle...'); - console.log(`[audrey] Embedding: ${embeddingLabel}`); - - const result = await audrey.dream(); - const health = audrey.memoryStatus(); - - console.log( - `[audrey] Consolidation: evaluated ${result.consolidation.episodesEvaluated} episodes, ` - + `found ${result.consolidation.clustersFound} clusters, extracted ${result.consolidation.principlesExtracted} principles ` - + `(${result.consolidation.semanticsCreated ?? 0} semantic, ${result.consolidation.proceduresCreated ?? 0} procedural)` - ); - console.log( - `[audrey] Decay: evaluated ${result.decay.totalEvaluated} memories, ` - + `${result.decay.transitionedToDormant} transitioned to dormant` - ); - console.log( - `[audrey] Final: ${result.stats.episodic} episodic, ${result.stats.semantic} semantic, ${result.stats.procedural} procedural ` - + `| ${health.healthy ? 'healthy' : 'unhealthy'}` - ); - console.log('[audrey] Dream complete.'); - } finally { - audrey.close(); - } -} - -async function greeting(): Promise { - const dataDir = resolveDataDir(process.env); - const contextArg = process.argv[3] || undefined; - - if (!existsSync(dataDir)) { - console.log('[audrey] No data yet - fresh start.'); - return; - } - - const storedDimensions = readStoredDimensions(dataDir); - const resolvedEmbedding = resolveEmbeddingProvider(process.env, process.env['AUDREY_EMBEDDING_PROVIDER']); - const canUseResolvedEmbedding = Boolean(contextArg) - && storedDimensions !== null - && storedDimensions === resolvedEmbedding.dimensions; - const dimensions = storedDimensions || resolvedEmbedding.dimensions || 8; - const audrey = new Audrey({ - dataDir, - agent: 'greeting', - embedding: canUseResolvedEmbedding - ? resolvedEmbedding - : { provider: 'mock' as const, dimensions }, - }); - - try { - if (canUseResolvedEmbedding) { - await initializeEmbeddingProvider(audrey.embeddingProvider); - } - const result = await audrey.greeting({ context: canUseResolvedEmbedding ? contextArg : undefined }); - const health = audrey.memoryStatus(); - - const lines: string[] = []; - lines.push(`[Audrey v${VERSION}] Memory briefing`); - lines.push(''); - - if (contextArg && !canUseResolvedEmbedding) { + +interface StatusReport { + generatedAt: string; + registered: boolean; + dataDir: string; + exists: boolean; + storedDimensions: number | null; + stats: IntrospectResult | null; + health: MemoryStatusResult | null; + lastConsolidation: string | null; + error: string | null; +} + +// --------------------------------------------------------------------------- +// CLI subcommands +// --------------------------------------------------------------------------- + +async function serveHttp(): Promise { + const { startServer } = await import('../src/server.js'); + const config = buildAudreyConfig(); + const port = parseInt(process.env.AUDREY_PORT || '7437', 10); + const apiKey = process.env.AUDREY_API_KEY; + + const server = await startServer({ port, config, apiKey }); + console.error(`[audrey-http] v${VERSION} serving on port ${server.port}`); + if (apiKey) { + console.error('[audrey-http] API key authentication enabled'); + } +} + +async function reembed(): Promise { + const dataDir = resolveDataDir(process.env); + const explicit = process.env['AUDREY_EMBEDDING_PROVIDER']; + const embedding = resolveEmbeddingProvider(process.env, explicit); + const storedDims = readStoredDimensions(dataDir); + const dimensionsChanged = storedDims !== null && storedDims !== embedding.dimensions; + + console.log(`Re-embedding with ${embedding.provider} (${embedding.dimensions}d)...`); + if (dimensionsChanged) { + console.log(`Dimension change: ${storedDims}d -> ${embedding.dimensions}d (will drop and recreate vec tables)`); + } + + const audrey = new Audrey({ dataDir, agent: 'reembed', embedding }); + try { + await initializeEmbeddingProvider(audrey.embeddingProvider); + const { reembedAll } = await import('../src/migrate.js'); + const counts = await reembedAll(audrey.db, audrey.embeddingProvider, { dropAndRecreate: dimensionsChanged }); + console.log(`Done. Re-embedded: ${counts.episodes} episodes, ${counts.semantics} semantics, ${counts.procedures} procedures`); + } finally { + audrey.close(); + } +} + +async function dream(): Promise { + const dataDir = resolveDataDir(process.env); + const explicit = process.env['AUDREY_EMBEDDING_PROVIDER']; + const embedding = resolveEmbeddingProvider(process.env, explicit); + const storedDims = readStoredDimensions(dataDir); + + const config: AudreyConfig = { + dataDir, + agent: 'dream', + embedding, + }; + + const llm = resolveLLMProvider(process.env, process.env['AUDREY_LLM_PROVIDER']); + if (llm) config.llm = llm as AudreyConfig['llm']; + + const audrey = new Audrey(config); + try { + await initializeEmbeddingProvider(audrey.embeddingProvider); + + const embeddingLabel = storedDims !== null && storedDims !== embedding.dimensions + ? `${embedding.provider} (${embedding.dimensions}d; stored ${storedDims}d)` + : `${embedding.provider} (${embedding.dimensions}d)`; + + console.log('[audrey] Starting dream cycle...'); + console.log(`[audrey] Embedding: ${embeddingLabel}`); + + const result = await audrey.dream(); + const health = audrey.memoryStatus(); + + console.log( + `[audrey] Consolidation: evaluated ${result.consolidation.episodesEvaluated} episodes, ` + + `found ${result.consolidation.clustersFound} clusters, extracted ${result.consolidation.principlesExtracted} principles ` + + `(${result.consolidation.semanticsCreated ?? 0} semantic, ${result.consolidation.proceduresCreated ?? 0} procedural)` + ); + console.log( + `[audrey] Decay: evaluated ${result.decay.totalEvaluated} memories, ` + + `${result.decay.transitionedToDormant} transitioned to dormant` + ); + console.log( + `[audrey] Final: ${result.stats.episodic} episodic, ${result.stats.semantic} semantic, ${result.stats.procedural} procedural ` + + `| ${health.healthy ? 'healthy' : 'unhealthy'}` + ); + console.log('[audrey] Dream complete.'); + } finally { + audrey.close(); + } +} + +async function greeting(): Promise { + const dataDir = resolveDataDir(process.env); + const contextArg = process.argv[3] || undefined; + + if (!existsSync(dataDir)) { + console.log('[audrey] No data yet - fresh start.'); + return; + } + + const storedDimensions = readStoredDimensions(dataDir); + const resolvedEmbedding = resolveEmbeddingProvider(process.env, process.env['AUDREY_EMBEDDING_PROVIDER']); + const canUseResolvedEmbedding = Boolean(contextArg) + && storedDimensions !== null + && storedDimensions === resolvedEmbedding.dimensions; + const dimensions = storedDimensions || resolvedEmbedding.dimensions || 8; + const audrey = new Audrey({ + dataDir, + agent: 'greeting', + embedding: canUseResolvedEmbedding + ? resolvedEmbedding + : { provider: 'mock' as const, dimensions }, + }); + + try { + if (canUseResolvedEmbedding) { + await initializeEmbeddingProvider(audrey.embeddingProvider); + } + const result = await audrey.greeting({ context: canUseResolvedEmbedding ? contextArg : undefined }); + const health = audrey.memoryStatus(); + + const lines: string[] = []; + lines.push(`[Audrey v${VERSION}] Memory briefing`); + lines.push(''); + + if (contextArg && !canUseResolvedEmbedding) { + lines.push( + `Context recall skipped: stored index is ${storedDimensions ?? 'unknown'}d ` + + `but current embedding config resolves to ${resolvedEmbedding.dimensions}d.` + ); + lines.push(''); + } + + // Mood + if (result.mood && result.mood.samples > 0) { + const v = result.mood.valence; + const moodWord = v > 0.3 ? 'positive' : v < -0.3 ? 'negative' : 'neutral'; lines.push( - `Context recall skipped: stored index is ${storedDimensions ?? 'unknown'}d ` - + `but current embedding config resolves to ${resolvedEmbedding.dimensions}d.` + `Mood: ${moodWord} (valence=${v.toFixed(2)}, ` + + `arousal=${result.mood.arousal.toFixed(2)}, ` + + `from ${result.mood.samples} recent memories)` ); - lines.push(''); - } - - // Mood - if (result.mood && result.mood.samples > 0) { - const v = result.mood.valence; - const moodWord = v > 0.3 ? 'positive' : v < -0.3 ? 'negative' : 'neutral'; - lines.push(`Mood: ${moodWord} (valence=${v.toFixed(2)}, arousal=${result.mood.arousal.toFixed(2)}, from ${result.mood.samples} recent memories)`); - } - - // Health - const stats = audrey.introspect(); - lines.push(`Memory: ${stats.episodic} episodic, ${stats.semantic} semantic, ${stats.procedural} procedural | ${health.healthy ? 'healthy' : 'needs attention'}`); - lines.push(''); - - // Principles (semantic memories) - if (result.principles?.length > 0) { - lines.push('Learned principles:'); - for (const p of result.principles) { - lines.push(` - ${p.content}`); - } - lines.push(''); - } - - // Identity (private memories) - if (result.identity?.length > 0) { - lines.push('Identity:'); - for (const m of result.identity) { - lines.push(` - ${m.content}`); - } - lines.push(''); - } - - // Recent memories - if (result.recent?.length > 0) { - lines.push('Recent memories:'); - for (const r of result.recent) { - const age = timeSince(r.created_at); - lines.push(` - [${age}] ${r.content.slice(0, 200)}`); - } - lines.push(''); - } - - // Unresolved - if (result.unresolved?.length > 0) { - lines.push('Unresolved threads:'); - for (const u of result.unresolved) { - lines.push(` - ${u.content.slice(0, 150)}`); - } - lines.push(''); - } - - // Contextual recall - if ((result.contextual?.length ?? 0) > 0) { - lines.push(`Context-relevant memories (query: "${contextArg}"):`); - for (const c of result.contextual!) { - lines.push(` - [${c.type}] ${c.content.slice(0, 200)}`); - } - lines.push(''); - } - - console.log(lines.join('\n')); - } finally { - audrey.close(); - } -} - -function timeSince(isoDate: string): string { - const ms = Date.now() - new Date(isoDate).getTime(); - const mins = Math.floor(ms / 60000); - if (mins < 60) return `${mins}m ago`; - const hours = Math.floor(mins / 60); - if (hours < 24) return `${hours}h ago`; - const days = Math.floor(hours / 24); - return `${days}d ago`; -} - -async function reflect(): Promise { - const dataDir = resolveDataDir(process.env); - const explicit = process.env['AUDREY_EMBEDDING_PROVIDER']; - const embedding = resolveEmbeddingProvider(process.env, explicit); - - const config: AudreyConfig = { - dataDir, - agent: 'reflect', - embedding, - }; - - const llm = resolveLLMProvider(process.env, process.env['AUDREY_LLM_PROVIDER']); - if (llm) config.llm = llm as AudreyConfig['llm']; - - const audrey = new Audrey(config); - try { - await initializeEmbeddingProvider(audrey.embeddingProvider); - - // Read conversation turns from stdin if available - let turns: unknown[] | null = null; - if (!process.stdin.isTTY) { - const chunks: Buffer[] = []; - for await (const chunk of process.stdin) { - chunks.push(chunk as Buffer); - } - const raw = Buffer.concat(chunks).toString('utf-8').trim(); - if (raw) { - try { - turns = JSON.parse(raw) as unknown[]; - } catch { - console.error('[audrey] Could not parse stdin as JSON turns, skipping reflect.'); - } - } - } - - if (turns && Array.isArray(turns) && turns.length > 0) { - console.log(`[audrey] Reflecting on ${turns.length} conversation turns...`); - const reflectResult = await audrey.reflect(turns as Array<{ role: string; content: string }>); - if (reflectResult.skipped) { - console.log(`[audrey] Reflect skipped: ${reflectResult.skipped}`); - } else { - console.log(`[audrey] Reflected: encoded ${reflectResult.encoded} lasting memories.`); - } - } - - // Always run dream cycle after reflect - console.log('[audrey] Starting dream cycle...'); - const result = await audrey.dream(); - console.log( - `[audrey] Consolidation: ${result.consolidation.episodesEvaluated} episodes evaluated, ` - + `${result.consolidation.clustersFound} clusters, ${result.consolidation.principlesExtracted} principles` - ); - console.log( - `[audrey] Decay: ${result.decay.totalEvaluated} evaluated, ` - + `${result.decay.transitionedToDormant} dormant` - ); - console.log( - `[audrey] Status: ${result.stats.episodic} episodic, ${result.stats.semantic} semantic, ` - + `${result.stats.procedural} procedural` + } + + // Health + const stats = audrey.introspect(); + lines.push( + `Memory: ${stats.episodic} episodic, ${stats.semantic} semantic, ` + + `${stats.procedural} procedural | ${health.healthy ? 'healthy' : 'needs attention'}` ); - console.log('[audrey] Dream complete.'); - } finally { - audrey.close(); - } -} - -function install(): void { - try { - execFileSync('claude', ['--version'], { stdio: 'ignore' }); - } catch { - console.error('Error: claude CLI not found. Install Claude Code first: https://docs.anthropic.com/en/docs/claude-code'); - process.exit(1); - } - - const dataDir = resolveDataDir(process.env); - const resolvedEmbedding = resolveEmbeddingProvider(process.env, process.env['AUDREY_EMBEDDING_PROVIDER']); - const resolvedLlm = resolveLLMProvider(process.env, process.env['AUDREY_LLM_PROVIDER']); - if (resolvedEmbedding.provider === 'gemini') { - console.log('Using Gemini embeddings (3072d)'); - } else if (resolvedEmbedding.provider === 'local') { - console.log(`Using local embeddings (384d, device=${resolvedEmbedding.device || 'gpu'})`); - } else if (resolvedEmbedding.provider === 'openai') { - console.log('Using OpenAI embeddings (1536d)'); - } else if (resolvedEmbedding.provider === 'mock') { - console.log('Using mock embeddings'); - } - - if (resolvedLlm?.provider === 'anthropic') { - console.log('Using Anthropic for LLM-powered consolidation, contradiction detection, and reflection'); - } else if (resolvedLlm?.provider === 'openai') { - console.log('Using OpenAI for LLM-powered consolidation, contradiction detection, and reflection'); - } else if (resolvedLlm?.provider === 'mock') { - console.log('Using mock LLM provider'); - } else { - console.log('No LLM provider configured - consolidation and contradiction detection will use heuristics'); - } - - try { - execFileSync('claude', ['mcp', 'remove', SERVER_NAME], { stdio: 'ignore' }); - } catch { - // Not registered yet. - } - - const args = buildInstallArgs(process.env); - try { - execFileSync('claude', args, { stdio: 'inherit' }); - } catch { - console.error('Failed to register MCP server. Is Claude Code installed and on your PATH?'); - process.exit(1); - } - - console.log(` + lines.push(''); + + // Principles (semantic memories) + if (result.principles?.length > 0) { + lines.push('Learned principles:'); + for (const p of result.principles) { + lines.push(` - ${p.content}`); + } + lines.push(''); + } + + // Identity (private memories) + if (result.identity?.length > 0) { + lines.push('Identity:'); + for (const m of result.identity) { + lines.push(` - ${m.content}`); + } + lines.push(''); + } + + // Recent memories + if (result.recent?.length > 0) { + lines.push('Recent memories:'); + for (const r of result.recent) { + const age = timeSince(r.created_at); + lines.push(` - [${age}] ${r.content.slice(0, 200)}`); + } + lines.push(''); + } + + // Unresolved + if (result.unresolved?.length > 0) { + lines.push('Unresolved threads:'); + for (const u of result.unresolved) { + lines.push(` - ${u.content.slice(0, 150)}`); + } + lines.push(''); + } + + // Contextual recall + if ((result.contextual?.length ?? 0) > 0) { + lines.push(`Context-relevant memories (query: "${contextArg}"):`); + for (const c of result.contextual!) { + lines.push(` - [${c.type}] ${c.content.slice(0, 200)}`); + } + lines.push(''); + } + + console.log(lines.join('\n')); + } finally { + audrey.close(); + } +} + +function timeSince(isoDate: string): string { + const ms = Date.now() - new Date(isoDate).getTime(); + const mins = Math.floor(ms / 60000); + if (mins < 60) return `${mins}m ago`; + const hours = Math.floor(mins / 60); + if (hours < 24) return `${hours}h ago`; + const days = Math.floor(hours / 24); + return `${days}d ago`; +} + +async function reflect(): Promise { + const dataDir = resolveDataDir(process.env); + const explicit = process.env['AUDREY_EMBEDDING_PROVIDER']; + const embedding = resolveEmbeddingProvider(process.env, explicit); + + const config: AudreyConfig = { + dataDir, + agent: 'reflect', + embedding, + }; + + const llm = resolveLLMProvider(process.env, process.env['AUDREY_LLM_PROVIDER']); + if (llm) config.llm = llm as AudreyConfig['llm']; + + const audrey = new Audrey(config); + try { + await initializeEmbeddingProvider(audrey.embeddingProvider); + + // Read conversation turns from stdin if available + let turns: unknown[] | null = null; + if (!process.stdin.isTTY) { + const chunks: Buffer[] = []; + for await (const chunk of process.stdin) { + chunks.push(chunk as Buffer); + } + const raw = Buffer.concat(chunks).toString('utf-8').trim(); + if (raw) { + try { + turns = JSON.parse(raw) as unknown[]; + } catch { + console.error('[audrey] Could not parse stdin as JSON turns, skipping reflect.'); + } + } + } + + if (turns && Array.isArray(turns) && turns.length > 0) { + console.log(`[audrey] Reflecting on ${turns.length} conversation turns...`); + const reflectResult = await audrey.reflect(turns as Array<{ role: string; content: string }>); + if (reflectResult.skipped) { + console.log(`[audrey] Reflect skipped: ${reflectResult.skipped}`); + } else { + console.log(`[audrey] Reflected: encoded ${reflectResult.encoded} lasting memories.`); + } + } + + // Always run dream cycle after reflect + console.log('[audrey] Starting dream cycle...'); + const result = await audrey.dream(); + console.log( + `[audrey] Consolidation: ${result.consolidation.episodesEvaluated} episodes evaluated, ` + + `${result.consolidation.clustersFound} clusters, ${result.consolidation.principlesExtracted} principles` + ); + console.log( + `[audrey] Decay: ${result.decay.totalEvaluated} evaluated, ` + + `${result.decay.transitionedToDormant} dormant` + ); + console.log( + `[audrey] Status: ${result.stats.episodic} episodic, ${result.stats.semantic} semantic, ` + + `${result.stats.procedural} procedural` + ); + console.log('[audrey] Dream complete.'); + } finally { + audrey.close(); + } +} + +function install(): void { + try { + execFileSync('claude', ['--version'], { stdio: 'ignore' }); + } catch { + console.error('Error: claude CLI not found. Install Claude Code first: https://docs.anthropic.com/en/docs/claude-code'); + process.exit(1); + } + + const dataDir = resolveDataDir(process.env); + const resolvedEmbedding = resolveEmbeddingProvider(process.env, process.env['AUDREY_EMBEDDING_PROVIDER']); + const resolvedLlm = resolveLLMProvider(process.env, process.env['AUDREY_LLM_PROVIDER']); + if (resolvedEmbedding.provider === 'gemini') { + console.log('Using Gemini embeddings (3072d)'); + } else if (resolvedEmbedding.provider === 'local') { + console.log(`Using local embeddings (384d, device=${resolvedEmbedding.device || 'gpu'})`); + } else if (resolvedEmbedding.provider === 'openai') { + console.log('Using OpenAI embeddings (1536d)'); + } else if (resolvedEmbedding.provider === 'mock') { + console.log('Using mock embeddings'); + } + + if (resolvedLlm?.provider === 'anthropic') { + console.log('Using Anthropic for LLM-powered consolidation, contradiction detection, and reflection'); + } else if (resolvedLlm?.provider === 'openai') { + console.log('Using OpenAI for LLM-powered consolidation, contradiction detection, and reflection'); + } else if (resolvedLlm?.provider === 'mock') { + console.log('Using mock LLM provider'); + } else { + console.log('No LLM provider configured - consolidation and contradiction detection will use heuristics'); + } + + try { + execFileSync('claude', ['mcp', 'remove', SERVER_NAME], { stdio: 'ignore' }); + } catch { + // Not registered yet. + } + + const args = buildInstallArgs(process.env); + try { + execFileSync('claude', args, { stdio: 'inherit' }); + } catch { + console.error('Failed to register MCP server. Is Claude Code installed and on your PATH?'); + process.exit(1); + } + + console.log(` Audrey registered as "${SERVER_NAME}" with Claude Code. -13 MCP tools available in every session: +19 MCP tools available in every session: memory_encode - Store observations, facts, preferences memory_recall - Search memories by semantic similarity memory_consolidate - Extract principles from accumulated episodes - memory_dream - Full sleep cycle: consolidate + decay + stats - memory_introspect - Check memory system health - memory_resolve_truth - Resolve contradictions between claims - memory_export - Export all memories as JSON snapshot - memory_import - Import a snapshot into a fresh database - memory_forget - Forget a specific memory by ID or query - memory_decay - Apply forgetting curves, transition low-confidence to dormant + memory_dream - Full sleep cycle: consolidate + decay + stats + memory_introspect - Check memory system health + memory_resolve_truth - Resolve contradictions between claims + memory_export - Export all memories as JSON snapshot + memory_import - Import a snapshot into a fresh database + memory_forget - Forget a specific memory by ID or query + memory_decay - Apply forgetting curves, transition low-confidence to dormant memory_status - Check brain health (episode/vec sync, dimensions) memory_reflect - Form lasting memories from a conversation memory_greeting - Wake up as yourself: load identity, context, mood + memory_observe_tool - Record redacted tool-use events + memory_recent_failures - Inspect recent failed tool events + memory_capsule - Return a ranked, evidence-backed memory packet + memory_preflight - Check memory before an agent acts + memory_reflexes - Convert preflight evidence into trigger-response reflexes + memory_promote - Promote repeated lessons into project rules CLI subcommands: + npx audrey demo - Run a 60-second local proof with no network calls npx audrey install - Register MCP server with Claude Code + npx audrey mcp-config codex - Print Codex MCP TOML + npx audrey mcp-config generic - Print JSON config for other MCP hosts npx audrey uninstall - Remove MCP server registration npx audrey status - Show memory store health and stats - npx audrey status --json - Emit machine-readable health output - npx audrey status --json --fail-on-unhealthy - Exit non-zero on unhealthy status - npx audrey greeting - Output session briefing (for hooks) - npx audrey reflect - Reflect on conversation + dream cycle (for hooks) - npx audrey dream - Run consolidation + decay cycle + npx audrey status --json - Emit machine-readable health output + npx audrey status --json --fail-on-unhealthy - Exit non-zero on unhealthy status + npx audrey greeting - Output session briefing (for hooks) + npx audrey reflect - Reflect on conversation + dream cycle (for hooks) + npx audrey dream - Run consolidation + decay cycle npx audrey reembed - Re-embed all memories with current provider Data stored in: ${dataDir} Verify: claude mcp list -`); -} - -function uninstall(): void { - try { - execFileSync('claude', ['--version'], { stdio: 'ignore' }); - } catch { - console.error('Error: claude CLI not found.'); - process.exit(1); +`); +} + +function uninstall(): void { + try { + execFileSync('claude', ['--version'], { stdio: 'ignore' }); + } catch { + console.error('Error: claude CLI not found.'); + process.exit(1); + } + + try { + execFileSync('claude', ['mcp', 'remove', SERVER_NAME], { stdio: 'inherit' }); + console.log(`Removed "${SERVER_NAME}" from Claude Code.`); + } catch { + console.error(`Failed to remove "${SERVER_NAME}". It may not be registered.`); + process.exit(1); } +} +function printMcpConfig(): void { + const host = process.argv[3] || 'generic'; try { - execFileSync('claude', ['mcp', 'remove', SERVER_NAME], { stdio: 'inherit' }); - console.log(`Removed "${SERVER_NAME}" from Claude Code.`); - } catch { - console.error(`Failed to remove "${SERVER_NAME}". It may not be registered.`); - process.exit(1); + console.log(formatMcpHostConfig(host, process.env)); + } catch (err) { + const message = err instanceof Error ? err.message : String(err); + console.error(`[audrey] mcp-config failed: ${message}`); + process.exit(2); } } -function cliHasFlag(flag: string, argv: string[] = process.argv): boolean { - return Array.isArray(argv) && argv.includes(flag); +function sectionTitle(section: string): string { + return section.replace(/_/g, ' '); } -export function buildStatusReport({ - dataDir = resolveDataDir(process.env), - claudeJsonPath = join(homedir(), '.claude.json'), -}: { dataDir?: string; claudeJsonPath?: string } = {}): StatusReport { - let registered = false; +function createDemoDir(): string { + const preferredParent = process.env['AUDREY_DEMO_PARENT_DIR'] || tmpdir(); try { - const claudeConfig = JSON.parse(readFileSync(claudeJsonPath, 'utf-8')) as { mcpServers?: Record }; - registered = SERVER_NAME in (claudeConfig.mcpServers || {}); + return mkdtempSync(join(preferredParent, 'audrey-demo-')); } catch { - // Ignore unreadable config. - } - - const report: StatusReport = { - generatedAt: new Date().toISOString(), - registered, - dataDir, - exists: existsSync(dataDir), - storedDimensions: null, - stats: null, - health: null, - lastConsolidation: null, - error: null, - }; - - if (!report.exists) { - return report; + const fallbackParent = join(process.cwd(), '.audrey-demo-tmp'); + mkdirSync(fallbackParent, { recursive: true }); + return mkdtempSync(join(fallbackParent, 'run-')); } - - try { - report.storedDimensions = readStoredDimensions(dataDir); - const dimensions = report.storedDimensions || 8; - const audrey = new Audrey({ - dataDir, - agent: 'status-check', - embedding: { provider: 'mock', dimensions }, - }); - report.stats = audrey.introspect(); - report.health = audrey.memoryStatus(); - report.lastConsolidation = (audrey.db.prepare(` - SELECT completed_at FROM consolidation_runs - WHERE status = 'completed' - ORDER BY completed_at DESC - LIMIT 1 - `).get() as { completed_at?: string } | undefined)?.completed_at ?? 'never'; - audrey.close(); - } catch (err) { - report.error = (err as Error).message || String(err); - } - - return report; } -export function formatStatusReport(report: StatusReport): string { - const lines: string[] = []; - lines.push(`Registration: ${report.registered ? 'active' : 'not registered'}`); - - if (!report.exists) { - lines.push(`Data directory: ${report.dataDir} (not yet created - will be created on first use)`); - return lines.join('\n'); - } - - if (report.error) { - lines.push(`Data directory: ${report.dataDir} (exists but could not read: ${report.error})`); - return lines.join('\n'); - } - - lines.push(`Data directory: ${report.dataDir}`); - lines.push(`Stored dimensions: ${report.storedDimensions ?? 'unknown'}`); - lines.push( - `Memories: ${report.stats!.episodic} episodic, ${report.stats!.semantic} semantic, ${report.stats!.procedural} procedural` - ); - lines.push( - `Index sync: ${report.health!.vec_episodes}/${report.health!.searchable_episodes} episodic, ` - + `${report.health!.vec_semantics}/${report.health!.searchable_semantics} semantic, ` - + `${report.health!.vec_procedures}/${report.health!.searchable_procedures} procedural` - ); - lines.push( - `Health: ${report.health!.healthy ? 'healthy' : 'unhealthy'}` - + `${report.health!.reembed_recommended ? ' (re-embed recommended)' : ''}` - ); - lines.push(`Dormant: ${report.stats!.dormant}`); - lines.push(`Causal links: ${report.stats!.causalLinks}`); - lines.push(`Contradictions: ${report.stats!.contradictions.open} open, ${report.stats!.contradictions.resolved} resolved`); - lines.push(`Consolidation runs: ${report.stats!.totalConsolidationRuns}`); - lines.push(`Last consolidation: ${report.lastConsolidation}`); - - return lines.join('\n'); -} - -export function runStatusCommand({ - argv = process.argv, - dataDir = resolveDataDir(process.env), - claudeJsonPath = join(homedir(), '.claude.json'), +export async function runDemoCommand({ out = console.log, + keep = process.argv.includes('--keep'), }: { - argv?: string[]; - dataDir?: string; - claudeJsonPath?: string; out?: (...args: unknown[]) => void; -} = {}): { report: StatusReport; exitCode: number } { - const report = buildStatusReport({ dataDir, claudeJsonPath }); - if (cliHasFlag('--json', argv)) { - out(JSON.stringify(report, null, 2)); - } else { - out(formatStatusReport(report)); - } - - const exitCode = report.error - || (cliHasFlag('--fail-on-unhealthy', argv) && report.exists && report.health && !report.health.healthy) - ? 1 - : 0; - - return { report, exitCode }; -} - -function status(): void { - const { exitCode } = runStatusCommand(); - if (exitCode !== 0) { - process.exitCode = exitCode; - } -} - -function toolResult(data: unknown): { content: Array<{ type: 'text'; text: string }> } { - return { content: [{ type: 'text' as const, text: JSON.stringify(data) }] }; -} - -function toolError(err: unknown): { isError: boolean; content: Array<{ type: 'text'; text: string }> } { - return { isError: true, content: [{ type: 'text' as const, text: `Error: ${(err as Error).message || String(err)}` }] }; -} - -export function registerShutdownHandlers( - processRef: NodeJS.Process, - audrey: Audrey, - logger: (...args: unknown[]) => void = console.error, -): (message?: string, exitCode?: number) => void { - let closed = false; - - const shutdown = (message?: string, exitCode = 0): void => { - if (message) { - logger(message); - } - if (!closed) { - closed = true; - try { - audrey.close(); - } catch (err) { - logger(`[audrey-mcp] shutdown error: ${(err as Error).message || String(err)}`); - exitCode = exitCode === 0 ? 1 : exitCode; - } - } - if (typeof processRef.exit === 'function') { - processRef.exit(exitCode); - } - }; - - processRef.once('SIGINT', () => shutdown('[audrey-mcp] received SIGINT, shutting down')); - processRef.once('SIGTERM', () => shutdown('[audrey-mcp] received SIGTERM, shutting down')); - processRef.once('SIGHUP', () => shutdown('[audrey-mcp] received SIGHUP, shutting down')); - processRef.once('uncaughtException', (err: Error) => { - logger('[audrey-mcp] uncaught exception:', err); - shutdown(undefined, 1); - }); - processRef.once('unhandledRejection', (reason: unknown) => { - logger('[audrey-mcp] unhandled rejection:', reason); - shutdown(undefined, 1); - }); - - return shutdown; -} - -// eslint-disable-next-line @typescript-eslint/no-explicit-any -export function registerDreamTool(server: any, audrey: Audrey): void { - server.tool( - 'memory_dream', - { - min_cluster_size: z.number().optional().describe('Minimum episodes per cluster (default 3)'), - similarity_threshold: z.number().optional().describe('Similarity threshold for clustering (default 0.85)'), - dormant_threshold: z.number().min(0).max(1).optional().describe('Confidence below which memories go dormant (default 0.1)'), - }, - async ({ min_cluster_size, similarity_threshold, dormant_threshold }: { - min_cluster_size?: number; - similarity_threshold?: number; - dormant_threshold?: number; - }) => { - try { - const result = await audrey.dream({ - minClusterSize: min_cluster_size, - similarityThreshold: similarity_threshold, - dormantThreshold: dormant_threshold, - }); - return toolResult(result); - } catch (err) { - return toolError(err); - } - }, - ); -} - -async function main(): Promise { - const { McpServer } = await import('@modelcontextprotocol/sdk/server/mcp.js'); - const { StdioServerTransport } = await import('@modelcontextprotocol/sdk/server/stdio.js'); - const config = buildAudreyConfig(); - const audrey = new Audrey(config); - - const embLabel = config.embedding?.provider === 'mock' - ? 'mock embeddings - set OPENAI_API_KEY for real semantic search' - : `${config.embedding?.provider} embeddings (${config.embedding?.dimensions}d)`; - console.error(`[audrey-mcp] v${VERSION} started - agent=${config.agent} dataDir=${config.dataDir} (${embLabel})`); - - const server = new McpServer({ - name: SERVER_NAME, - version: VERSION, - }); - - server.tool('memory_encode', memoryEncodeToolSchema, async ({ content, source, tags, salience, private: isPrivate, context, affect }) => { - try { - validateMemoryContent(content); - const id = await audrey.encode({ content, source, tags, salience, private: isPrivate, context, affect }); - return toolResult({ id, content, source, private: isPrivate ?? false }); - } catch (err) { - return toolError(err); - } - }); - - server.tool('memory_recall', memoryRecallToolSchema, async ({ query, limit, types, min_confidence, tags, sources, after, before, context, mood }) => { - try { - const results = await audrey.recall(query, { - limit: limit ?? 10, - types, - minConfidence: min_confidence, - tags, - sources, - after, - before, - context, - mood, - }); - return toolResult(results); - } catch (err) { - return toolError(err); - } + keep?: boolean; +} = {}): Promise { + const demoDir = createDemoDir(); + const audrey = new Audrey({ + dataDir: demoDir, + agent: 'audrey-demo', + embedding: { provider: 'mock', dimensions: 64 }, + llm: { provider: 'mock' }, }); - server.tool('memory_consolidate', { - min_cluster_size: z.number().optional().describe('Minimum episodes per cluster'), - similarity_threshold: z.number().optional().describe('Similarity threshold for clustering'), - }, async ({ min_cluster_size, similarity_threshold }) => { - try { - const consolidation = await audrey.consolidate({ - minClusterSize: min_cluster_size, - similarityThreshold: similarity_threshold, - }); - return toolResult(consolidation); - } catch (err) { - return toolError(err); - } - }); + try { + out('Audrey 60-second memory demo'); + out(''); + out(`Memory store: ${demoDir}`); + out('Writing memories that could have come from Codex, Claude, or an Ollama agent...'); + + const ids: string[] = []; + ids.push(await audrey.encode({ + content: 'Audrey should work across Codex, Claude Code, Claude Desktop, Cursor, and Ollama-backed local agents.', + source: 'direct-observation', + tags: ['must-follow', 'host-neutral', 'codex', 'ollama'], + })); + ids.push(await audrey.encode({ + content: 'Before an agent starts work, ask Audrey for a Memory Capsule and include the capsule in the model context.', + source: 'direct-observation', + tags: ['procedure', 'memory-capsule', 'agent-loop'], + })); + ids.push(await audrey.encode({ + content: 'If a host cannot auto-install Audrey, run npx audrey mcp-config codex ' + + 'or npx audrey mcp-config generic and paste the generated config.', + source: 'direct-observation', + tags: ['procedure', 'mcp', 'first-contact'], + })); + ids.push(await audrey.encode({ + content: 'Repeated tool failures should become procedural warnings before the agent retries the same risky action.', + source: 'direct-observation', + tags: ['risk', 'procedure', 'tool-trace'], + })); + ids.push(await audrey.encode({ + content: 'Memory Reflexes turn preflight evidence into trigger-response rules an agent can follow before tool use.', + source: 'direct-observation', + tags: ['procedure', 'memory-reflexes', 'agent-loop'], + })); + + const event = audrey.observeTool({ + event: 'PostToolUse', + tool: 'npm test', + outcome: 'failed', + errorSummary: 'Vitest can fail with spawn EPERM on locked-down Windows hosts; ' + + 'use build, typecheck, benchmarks, and direct dist smokes as the fallback evidence path.', + cwd: process.cwd(), + metadata: { demo: true, source: 'audrey demo' }, + }); - server.tool('memory_introspect', {}, async () => { - try { - return toolResult(audrey.introspect()); - } catch (err) { - return toolError(err); - } - }); + out(`Encoded ${ids.length} memories and 1 redacted tool trace (${event.event.id}).`); + out(''); - server.tool('memory_resolve_truth', { - contradiction_id: z.string().describe('ID of the contradiction to resolve'), - }, async ({ contradiction_id }) => { - try { - return toolResult(await audrey.resolveTruth(contradiction_id)); - } catch (err) { - return toolError(err); - } - }); + const query = 'How should an agent use Audrey with Codex and Ollama?'; + out(`Asking Audrey for a Memory Capsule: "${query}"`); + const capsule = await audrey.capsule(query, { + limit: 8, + budgetChars: 2400, + includeRisks: true, + includeContradictions: true, + }); - server.tool('memory_export', {}, async () => { - try { - return toolResult(audrey.export()); - } catch (err) { - return toolError(err); + out(''); + out('Capsule highlights:'); + let printed = 0; + for (const [name, entries] of Object.entries(capsule.sections)) { + if (!Array.isArray(entries) || entries.length === 0) continue; + printed += 1; + out(`- ${sectionTitle(name)}:`); + for (const entry of entries.slice(0, 2)) { + out(` * ${entry.content}`); + out(` why: ${entry.reason}`); + } } - }); - - server.tool('memory_import', memoryImportToolSchema, async ({ snapshot }) => { - try { - await audrey.import(snapshot as Parameters[0]); - return toolResult({ imported: true, stats: audrey.introspect() }); - } catch (err) { - return toolError(err); + if (printed === 0) { + out('- No capsule sections were populated. That is unexpected for this demo.'); } - }); - server.tool('memory_forget', memoryForgetToolSchema, async ({ id, query, min_similarity, purge }) => { - try { - validateForgetSelection(id, query); - let result; - if (id) { - result = audrey.forget(id, { purge: purge ?? false }); - } else { - result = await audrey.forgetByQuery(query!, { - minSimilarity: min_similarity ?? 0.9, - purge: purge ?? false, - }); - if (!result) { - return toolResult({ forgotten: false, reason: 'No memory found above similarity threshold' }); - } - } - return toolResult({ forgotten: true, ...result }); - } catch (err) { - return toolError(err); + const reflexReport = await audrey.reflexes('run npm test before release', { + tool: 'npm test', + includePreflight: false, + }); + out(''); + out('Memory Reflex proof:'); + const demoReflexes = [...reflexReport.reflexes].sort((a, b) => { + if (a.source === 'recent_failure' && b.source !== 'recent_failure') return -1; + if (b.source === 'recent_failure' && a.source !== 'recent_failure') return 1; + return 0; + }); + for (const reflex of demoReflexes.slice(0, 3)) { + out(`- ${reflex.trigger}`); + out(` ${reflex.response_type}: ${reflex.response}`); } - }); - server.tool('memory_decay', { - dormant_threshold: z.number().min(0).max(1).optional().describe('Confidence below which memories go dormant (default 0.1)'), - }, async ({ dormant_threshold }) => { - try { - return toolResult(audrey.decay({ dormantThreshold: dormant_threshold })); - } catch (err) { - return toolError(err); + const recall = await audrey.recall('Codex Ollama Memory Capsule host install', { limit: 3 }); + out(''); + out('Recall proof:'); + for (const memory of recall.slice(0, 3)) { + out(`- [${memory.type}] ${(memory.confidence * 100).toFixed(0)}% ${memory.content}`); } - }); - server.tool('memory_status', {}, async () => { - try { - return toolResult(audrey.memoryStatus()); - } catch (err) { - return toolError(err); + out(''); + out('Next steps:'); + out('- Codex: npx audrey mcp-config codex'); + out('- Any stdio MCP host: npx audrey mcp-config generic'); + out('- Ollama/local agents: npx audrey serve, then call /v1/reflexes, /v1/capsule, and /v1/recall as tools'); + if (keep) { + out(`- Demo data kept at: ${demoDir}`); } - }); - - server.tool('memory_reflect', { - turns: z.array(z.object({ - role: z.string().describe('Message role: user or assistant'), - content: z.string().describe('Message content'), - })).describe('Conversation turns to reflect on. Call at end of meaningful conversations to form lasting memories.'), - }, async ({ turns }) => { - try { - return toolResult(await audrey.reflect(turns)); - } catch (err) { - return toolError(err); + } finally { + audrey.close(); + if (!keep) { + rmSync(demoDir, { recursive: true, force: true }); } - }); - - registerDreamTool(server, audrey); + } +} - server.tool('memory_greeting', { - context: z.string().optional().describe('Optional hint about this session (e.g. "working on authentication feature"). If provided, also returns semantically relevant memories.'), - }, async ({ context }) => { - try { - return toolResult(await audrey.greeting({ context })); - } catch (err) { - return toolError(err); - } +function cliHasFlag(flag: string, argv: string[] = process.argv): boolean { + return Array.isArray(argv) && argv.includes(flag); +} + +export function buildStatusReport({ + dataDir = resolveDataDir(process.env), + claudeJsonPath = join(homedir(), '.claude.json'), +}: { dataDir?: string; claudeJsonPath?: string } = {}): StatusReport { + let registered = false; + try { + const claudeConfig = JSON.parse(readFileSync(claudeJsonPath, 'utf-8')) as { mcpServers?: Record }; + registered = SERVER_NAME in (claudeConfig.mcpServers || {}); + } catch { + // Ignore unreadable config. + } + + const report: StatusReport = { + generatedAt: new Date().toISOString(), + registered, + dataDir, + exists: existsSync(dataDir), + storedDimensions: null, + stats: null, + health: null, + lastConsolidation: null, + error: null, + }; + + if (!report.exists) { + return report; + } + + try { + report.storedDimensions = readStoredDimensions(dataDir); + const dimensions = report.storedDimensions || 8; + const audrey = new Audrey({ + dataDir, + agent: 'status-check', + embedding: { provider: 'mock', dimensions }, + }); + report.stats = audrey.introspect(); + report.health = audrey.memoryStatus(); + report.lastConsolidation = (audrey.db.prepare(` + SELECT completed_at FROM consolidation_runs + WHERE status = 'completed' + ORDER BY completed_at DESC + LIMIT 1 + `).get() as { completed_at?: string } | undefined)?.completed_at ?? 'never'; + audrey.close(); + } catch (err) { + report.error = (err as Error).message || String(err); + } + + return report; +} + +export function formatStatusReport(report: StatusReport): string { + const lines: string[] = []; + lines.push(`Registration: ${report.registered ? 'active' : 'not registered'}`); + + if (!report.exists) { + lines.push(`Data directory: ${report.dataDir} (not yet created - will be created on first use)`); + return lines.join('\n'); + } + + if (report.error) { + lines.push(`Data directory: ${report.dataDir} (exists but could not read: ${report.error})`); + return lines.join('\n'); + } + + lines.push(`Data directory: ${report.dataDir}`); + lines.push(`Stored dimensions: ${report.storedDimensions ?? 'unknown'}`); + lines.push( + `Memories: ${report.stats!.episodic} episodic, ${report.stats!.semantic} semantic, ${report.stats!.procedural} procedural` + ); + lines.push( + `Index sync: ${report.health!.vec_episodes}/${report.health!.searchable_episodes} episodic, ` + + `${report.health!.vec_semantics}/${report.health!.searchable_semantics} semantic, ` + + `${report.health!.vec_procedures}/${report.health!.searchable_procedures} procedural` + ); + lines.push( + `Health: ${report.health!.healthy ? 'healthy' : 'unhealthy'}` + + `${report.health!.reembed_recommended ? ' (re-embed recommended)' : ''}` + ); + lines.push(`Dormant: ${report.stats!.dormant}`); + lines.push(`Causal links: ${report.stats!.causalLinks}`); + lines.push(`Contradictions: ${report.stats!.contradictions.open} open, ${report.stats!.contradictions.resolved} resolved`); + lines.push(`Consolidation runs: ${report.stats!.totalConsolidationRuns}`); + lines.push(`Last consolidation: ${report.lastConsolidation}`); + + return lines.join('\n'); +} + +export function runStatusCommand({ + argv = process.argv, + dataDir = resolveDataDir(process.env), + claudeJsonPath = join(homedir(), '.claude.json'), + out = console.log, +}: { + argv?: string[]; + dataDir?: string; + claudeJsonPath?: string; + out?: (...args: unknown[]) => void; +} = {}): { report: StatusReport; exitCode: number } { + const report = buildStatusReport({ dataDir, claudeJsonPath }); + if (cliHasFlag('--json', argv)) { + out(JSON.stringify(report, null, 2)); + } else { + out(formatStatusReport(report)); + } + + const exitCode = report.error + || (cliHasFlag('--fail-on-unhealthy', argv) && report.exists && report.health && !report.health.healthy) + ? 1 + : 0; + + return { report, exitCode }; +} + +function status(): void { + const { exitCode } = runStatusCommand(); + if (exitCode !== 0) { + process.exitCode = exitCode; + } +} + +function toolResult(data: unknown): { content: Array<{ type: 'text'; text: string }> } { + return { content: [{ type: 'text' as const, text: JSON.stringify(data) }] }; +} + +function toolError(err: unknown): { isError: boolean; content: Array<{ type: 'text'; text: string }> } { + return { isError: true, content: [{ type: 'text' as const, text: `Error: ${(err as Error).message || String(err)}` }] }; +} + +export function registerShutdownHandlers( + processRef: NodeJS.Process, + audrey: Audrey, + logger: (...args: unknown[]) => void = console.error, +): (message?: string, exitCode?: number) => void { + let closed = false; + + const shutdown = (message?: string, exitCode = 0): void => { + if (message) { + logger(message); + } + if (!closed) { + closed = true; + try { + audrey.close(); + } catch (err) { + logger(`[audrey-mcp] shutdown error: ${(err as Error).message || String(err)}`); + exitCode = exitCode === 0 ? 1 : exitCode; + } + } + if (typeof processRef.exit === 'function') { + processRef.exit(exitCode); + } + }; + + processRef.once('SIGINT', () => shutdown('[audrey-mcp] received SIGINT, shutting down')); + processRef.once('SIGTERM', () => shutdown('[audrey-mcp] received SIGTERM, shutting down')); + processRef.once('SIGHUP', () => shutdown('[audrey-mcp] received SIGHUP, shutting down')); + processRef.once('uncaughtException', (err: Error) => { + logger('[audrey-mcp] uncaught exception:', err); + shutdown(undefined, 1); + }); + processRef.once('unhandledRejection', (reason: unknown) => { + logger('[audrey-mcp] unhandled rejection:', reason); + shutdown(undefined, 1); + }); + + return shutdown; +} + +// eslint-disable-next-line @typescript-eslint/no-explicit-any +export function registerDreamTool(server: any, audrey: Audrey): void { + server.tool( + 'memory_dream', + { + min_cluster_size: z.number().optional().describe('Minimum episodes per cluster (default 3)'), + similarity_threshold: z.number().optional().describe('Similarity threshold for clustering (default 0.85)'), + dormant_threshold: z.number().min(0).max(1).optional().describe('Confidence below which memories go dormant (default 0.1)'), + }, + async ({ min_cluster_size, similarity_threshold, dormant_threshold }: { + min_cluster_size?: number; + similarity_threshold?: number; + dormant_threshold?: number; + }) => { + try { + const result = await audrey.dream({ + minClusterSize: min_cluster_size, + similarityThreshold: similarity_threshold, + dormantThreshold: dormant_threshold, + }); + return toolResult(result); + } catch (err) { + return toolError(err); + } + }, + ); +} + +async function main(): Promise { + const { McpServer } = await import('@modelcontextprotocol/sdk/server/mcp.js'); + const { StdioServerTransport } = await import('@modelcontextprotocol/sdk/server/stdio.js'); + const config = buildAudreyConfig(); + const audrey = new Audrey(config); + + const embLabel = config.embedding?.provider === 'mock' + ? 'mock embeddings - set OPENAI_API_KEY for real semantic search' + : `${config.embedding?.provider} embeddings (${config.embedding?.dimensions}d)`; + console.error(`[audrey-mcp] v${VERSION} started - agent=${config.agent} dataDir=${config.dataDir} (${embLabel})`); + + const server = new McpServer({ + name: SERVER_NAME, + version: VERSION, + }); + + server.tool('memory_encode', memoryEncodeToolSchema, async ({ content, source, tags, salience, private: isPrivate, context, affect }) => { + try { + validateMemoryContent(content); + const id = await audrey.encode({ content, source, tags, salience, private: isPrivate, context, affect }); + return toolResult({ id, content, source, private: isPrivate ?? false }); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_recall', memoryRecallToolSchema, async ({ + query, + limit, + types, + min_confidence, + tags, + sources, + after, + before, + context, + mood, + }) => { + try { + const results = await audrey.recall(query, { + limit: limit ?? 10, + types, + minConfidence: min_confidence, + tags, + sources, + after, + before, + context, + mood, + }); + return toolResult(results); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_consolidate', { + min_cluster_size: z.number().optional().describe('Minimum episodes per cluster'), + similarity_threshold: z.number().optional().describe('Similarity threshold for clustering'), + }, async ({ min_cluster_size, similarity_threshold }) => { + try { + const consolidation = await audrey.consolidate({ + minClusterSize: min_cluster_size, + similarityThreshold: similarity_threshold, + }); + return toolResult(consolidation); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_introspect', {}, async () => { + try { + return toolResult(audrey.introspect()); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_resolve_truth', { + contradiction_id: z.string().describe('ID of the contradiction to resolve'), + }, async ({ contradiction_id }) => { + try { + return toolResult(await audrey.resolveTruth(contradiction_id)); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_export', {}, async () => { + try { + return toolResult(audrey.export()); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_import', memoryImportToolSchema, async ({ snapshot }) => { + try { + await audrey.import(snapshot as Parameters[0]); + return toolResult({ imported: true, stats: audrey.introspect() }); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_forget', memoryForgetToolSchema, async ({ id, query, min_similarity, purge }) => { + try { + validateForgetSelection(id, query); + let result; + if (id) { + result = audrey.forget(id, { purge: purge ?? false }); + } else { + result = await audrey.forgetByQuery(query!, { + minSimilarity: min_similarity ?? 0.9, + purge: purge ?? false, + }); + if (!result) { + return toolResult({ forgotten: false, reason: 'No memory found above similarity threshold' }); + } + } + return toolResult({ forgotten: true, ...result }); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_decay', { + dormant_threshold: z.number().min(0).max(1).optional().describe('Confidence below which memories go dormant (default 0.1)'), + }, async ({ dormant_threshold }) => { + try { + return toolResult(audrey.decay({ dormantThreshold: dormant_threshold })); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_status', {}, async () => { + try { + return toolResult(audrey.memoryStatus()); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_reflect', { + turns: z.array(z.object({ + role: z.string().describe('Message role: user or assistant'), + content: z.string().describe('Message content'), + })).describe('Conversation turns to reflect on. Call at end of meaningful conversations to form lasting memories.'), + }, async ({ turns }) => { + try { + return toolResult(await audrey.reflect(turns)); + } catch (err) { + return toolError(err); + } + }); + + registerDreamTool(server, audrey); + + server.tool('memory_greeting', { + context: z.string().optional().describe( + 'Optional hint about this session. When provided, Audrey also returns semantically relevant memories.' + ), + }, async ({ context }) => { + try { + return toolResult(await audrey.greeting({ context })); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_observe_tool', { + event: z.string().describe( + 'Hook event name (PreToolUse, PostToolUse, PostToolUseFailure, PreCompact, PostCompact, etc.)' + ), + tool: z.string().describe('Tool name being observed (Bash, Edit, Write, etc.)'), + session_id: z.string().optional().describe('Session identifier for grouping related events'), + input: z.unknown().optional().describe( + 'Tool input. Hashed and never stored raw; redacted metadata is only stored when retain_details is true.' + ), + output: z.unknown().optional().describe('Tool output. Same redaction and storage policy as input.'), + outcome: z.enum(['succeeded', 'failed', 'blocked', 'skipped', 'unknown']).optional().describe('Outcome classification'), + error_summary: z.string().optional().describe('Short error description if the tool failed. Redacted and truncated to 2 KB.'), + cwd: z.string().optional().describe('Working directory at the time of the tool call'), + files: z.array(z.string()).optional().describe('File paths to fingerprint (size + mtime + content hash)'), + metadata: z.record(z.string(), z.unknown()).optional().describe('Arbitrary structured metadata (redacted before storage)'), + retain_details: z.boolean().optional().describe( + 'If true, redacted input and output payloads are stored alongside hashes. Defaults to false.' + ), + }, async ({ + event, + tool, + session_id, + input, + output, + outcome, + error_summary, + cwd, + files, + metadata, + retain_details, + }) => { + try { + const result = audrey.observeTool({ + event, + tool, + sessionId: session_id, + input, + output, + outcome, + errorSummary: error_summary, + cwd, + files, + metadata, + retainDetails: retain_details, + }); + return toolResult({ + id: result.event.id, + event_type: result.event.event_type, + tool_name: result.event.tool_name, + outcome: result.event.outcome, + redaction_state: result.event.redaction_state, + redactions: result.redactions, + created_at: result.event.created_at, + }); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_recent_failures', { + since: z.string().optional().describe('ISO timestamp lower bound (defaults to 7 days ago)'), + limit: z.number().int().min(1).max(200).optional().describe('Max rows to return (defaults to 20)'), + }, async ({ since, limit }) => { + try { + return toolResult(audrey.recentFailures({ since, limit })); + } catch (err) { + return toolError(err); + } + }); + + server.tool('memory_capsule', { + query: z.string().describe('Natural-language query for the turn. Drives what gets surfaced.'), + limit: z.number().int().min(1).max(50).optional().describe('Max recall results to consider before categorization.'), + budget_chars: z.number().int().min(200).max(32000).optional().describe( + 'Token budget in characters (defaults to AUDREY_CONTEXT_BUDGET_CHARS or 4000).' + ), + mode: z.enum(['balanced', 'conservative', 'aggressive']).optional().describe( + 'Capsule mode: conservative = fewer, higher-confidence entries; aggressive = broader sweep.' + ), + recent_change_window_hours: z.number().int().min(1).max(720).optional().describe('How far back "recent_changes" looks (default 24h).'), + include_risks: z.boolean().optional().describe('Include recent tool failures as risks (default true).'), + include_contradictions: z.boolean().optional().describe('Include open contradictions (default true).'), + }, async ({ + query, + limit, + budget_chars, + mode, + recent_change_window_hours, + include_risks, + include_contradictions, + }) => { + try { + const capsule = await audrey.capsule(query, { + limit, + budgetChars: budget_chars, + mode, + recentChangeWindowHours: recent_change_window_hours, + includeRisks: include_risks, + includeContradictions: include_contradictions, + }); + return toolResult(capsule); + } catch (err) { + return toolError(err); + } }); - server.tool('memory_observe_tool', { - event: z.string().describe('Hook event name (PreToolUse, PostToolUse, PostToolUseFailure, PreCompact, PostCompact, etc.)'), - tool: z.string().describe('Tool name being observed (Bash, Edit, Write, etc.)'), - session_id: z.string().optional().describe('Session identifier for grouping related events'), - input: z.unknown().optional().describe('Tool input. Hashed and never stored raw; redacted + summarized into metadata only when retain_details is true.'), - output: z.unknown().optional().describe('Tool output. Same redaction and storage policy as input.'), - outcome: z.enum(['succeeded', 'failed', 'blocked', 'skipped', 'unknown']).optional().describe('Outcome classification'), - error_summary: z.string().optional().describe('Short error description if the tool failed. Redacted and truncated to 2 KB.'), - cwd: z.string().optional().describe('Working directory at the time of the tool call'), - files: z.array(z.string()).optional().describe('File paths to fingerprint (size + mtime + content hash)'), - metadata: z.record(z.string(), z.unknown()).optional().describe('Arbitrary structured metadata (redacted before storage)'), - retain_details: z.boolean().optional().describe('If true, redacted input and output payloads are stored alongside hashes. Defaults to false.'), - }, async ({ event, tool, session_id, input, output, outcome, error_summary, cwd, files, metadata, retain_details }) => { + server.tool('memory_preflight', memoryPreflightToolSchema, async ({ + action, + tool, + session_id, + cwd, + files, + strict, + limit, + budget_chars, + mode, + failure_window_hours, + include_status, + record_event, + include_capsule, + }) => { try { - const result = audrey.observeTool({ - event, + const preflight = await audrey.preflight(action, { tool, sessionId: session_id, - input, - output, - outcome, - errorSummary: error_summary, cwd, files, - metadata, - retainDetails: retain_details, - }); - return toolResult({ - id: result.event.id, - event_type: result.event.event_type, - tool_name: result.event.tool_name, - outcome: result.event.outcome, - redaction_state: result.event.redaction_state, - redactions: result.redactions, - created_at: result.event.created_at, - }); - } catch (err) { - return toolError(err); - } - }); - - server.tool('memory_recent_failures', { - since: z.string().optional().describe('ISO timestamp lower bound (defaults to 7 days ago)'), - limit: z.number().int().min(1).max(200).optional().describe('Max rows to return (defaults to 20)'), - }, async ({ since, limit }) => { - try { - return toolResult(audrey.recentFailures({ since, limit })); - } catch (err) { - return toolError(err); - } - }); - - server.tool('memory_capsule', { - query: z.string().describe('Natural-language query for the turn. Drives what gets surfaced.'), - limit: z.number().int().min(1).max(50).optional().describe('Max recall results to consider before categorization.'), - budget_chars: z.number().int().min(200).max(32000).optional().describe('Token budget in characters (defaults to AUDREY_CONTEXT_BUDGET_CHARS or 4000).'), - mode: z.enum(['balanced', 'conservative', 'aggressive']).optional().describe('Capsule mode: conservative = fewer, higher-confidence entries; aggressive = broader sweep.'), - recent_change_window_hours: z.number().int().min(1).max(720).optional().describe('How far back "recent_changes" looks (default 24h).'), - include_risks: z.boolean().optional().describe('Include recent tool failures as risks (default true).'), - include_contradictions: z.boolean().optional().describe('Include open contradictions (default true).'), - }, async ({ query, limit, budget_chars, mode, recent_change_window_hours, include_risks, include_contradictions }) => { - try { - const capsule = await audrey.capsule(query, { + strict, limit, budgetChars: budget_chars, mode, - recentChangeWindowHours: recent_change_window_hours, - includeRisks: include_risks, - includeContradictions: include_contradictions, + recentFailureWindowHours: failure_window_hours, + includeStatus: include_status, + recordEvent: record_event, + includeCapsule: include_capsule, }); - return toolResult(capsule); + return toolResult(preflight); } catch (err) { return toolError(err); } }); - server.tool('memory_promote', { - target: z.enum(['claude-rules']).optional().describe('Promotion target. Only claude-rules is implemented in PR 4 v1. AGENTS.md / playbook / hooks / checklist targets land in PR 4.1+.'), - min_confidence: z.number().min(0).max(1).optional().describe('Minimum memory confidence for promotion (default 0.7 for procedural, 0.8 for semantic).'), - min_evidence: z.number().int().min(1).optional().describe('Minimum supporting episode count (default 2).'), - limit: z.number().int().min(1).max(50).optional().describe('Max candidates to return/apply (default 20).'), - dry_run: z.boolean().optional().describe('If true (default), return candidates without writing. Pair with yes=true to actually write.'), - yes: z.boolean().optional().describe('Confirm write. Without this or dry_run=false the command stays in dry-run mode.'), - project_dir: z.string().optional().describe('Absolute path to the project root where .claude/rules/ should be created. Defaults to process.cwd().'), - }, async ({ target, min_confidence, min_evidence, limit, dry_run, yes, project_dir }) => { + server.tool('memory_reflexes', memoryReflexesToolSchema, async ({ + action, + tool, + session_id, + cwd, + files, + strict, + limit, + budget_chars, + mode, + failure_window_hours, + include_status, + record_event, + include_capsule, + include_preflight, + }) => { try { - const result = await audrey.promote({ - target, - minConfidence: min_confidence, - minEvidence: min_evidence, + const report = await audrey.reflexes(action, { + tool, + sessionId: session_id, + cwd, + files, + strict, limit, - dryRun: dry_run, - yes, - projectDir: project_dir, + budgetChars: budget_chars, + mode, + recentFailureWindowHours: failure_window_hours, + includeStatus: include_status, + recordEvent: record_event, + includeCapsule: include_capsule, + includePreflight: include_preflight, }); - return toolResult(result); + return toolResult(report); } catch (err) { return toolError(err); } }); - const transport = new StdioServerTransport(); - await server.connect(transport); - console.error('[audrey-mcp] connected via stdio'); - registerShutdownHandlers(process, audrey); -} - -function parseObserveToolArgs(argv: string[]): { - event?: string; - tool?: string; - sessionId?: string; - outcome?: string; - cwd?: string; - errorSummary?: string; - files?: string[]; - inputJson?: string; - outputJson?: string; - metadataJson?: string; - retainDetails?: boolean; -} { - const out: Record = {}; - for (let i = 0; i < argv.length; i++) { - const token = argv[i]; - const next = () => argv[++i]; - if (token === '--event') out.event = next(); - else if (token === '--tool') out.tool = next(); - else if (token === '--session-id') out.sessionId = next(); - else if (token === '--outcome') out.outcome = next(); - else if (token === '--cwd') out.cwd = next(); - else if (token === '--error-summary') out.errorSummary = next(); - else if (token === '--files') { - const list = next(); - if (list) out.files = list.split(',').map(s => s.trim()).filter(Boolean); - } - else if (token === '--input-json') out.inputJson = next(); - else if (token === '--output-json') out.outputJson = next(); - else if (token === '--metadata-json') out.metadataJson = next(); - else if (token === '--retain-details') out.retainDetails = true; - } - return out as ReturnType; -} - -async function observeToolCli(): Promise { - const args = parseObserveToolArgs(process.argv.slice(3)); - - let stdinPayload: Record | null = null; - if (!process.stdin.isTTY) { - const chunks: Buffer[] = []; - for await (const chunk of process.stdin) chunks.push(chunk as Buffer); - const raw = Buffer.concat(chunks).toString('utf-8').trim(); - if (raw) { - try { stdinPayload = JSON.parse(raw) as Record; } - catch { console.error('[audrey] observe-tool: stdin was not valid JSON, ignoring.'); } - } - } - - // Auto-extract common fields from the Claude Code hook payload so the hook - // config can be minimal: only --event needs to be specified on the command - // line; tool_name / session_id / cwd / hook_event_name come from stdin. - const effectiveEvent = args.event ?? (stdinPayload?.hook_event_name as string | undefined); - const effectiveTool = args.tool ?? (stdinPayload?.tool_name as string | undefined); - - if (!effectiveEvent) { - console.error('[audrey] observe-tool: --event is required (or provide hook_event_name in stdin JSON)'); - process.exit(2); - } - if (!effectiveTool) { - console.error('[audrey] observe-tool: --tool is required (or provide tool_name in stdin JSON)'); - process.exit(2); - } - - const parseMaybeJson = (text: string | undefined): unknown => { - if (text == null) return undefined; - try { return JSON.parse(text); } - catch { return text; } - }; - - const inputPayload = args.inputJson !== undefined - ? parseMaybeJson(args.inputJson) - : stdinPayload?.tool_input ?? stdinPayload?.input; - const outputPayload = args.outputJson !== undefined - ? parseMaybeJson(args.outputJson) - : stdinPayload?.tool_response ?? stdinPayload?.tool_output ?? stdinPayload?.output; - const metadataPayload = args.metadataJson !== undefined - ? parseMaybeJson(args.metadataJson) - : stdinPayload?.metadata; - - const sessionId = args.sessionId ?? (stdinPayload?.session_id as string | undefined); - const cwd = args.cwd ?? (stdinPayload?.cwd as string | undefined); - - // Detect failure from Claude Code hook payload shape: tool_response often - // includes a non-empty error or a success=false flag for failed tools. - let outcome = args.outcome as 'succeeded' | 'failed' | 'blocked' | 'skipped' | 'unknown' | undefined; - let errorSummary = args.errorSummary ?? (stdinPayload?.error_summary as string | undefined); - if (outcome == null && effectiveEvent === 'PostToolUse') { - const resp = (stdinPayload?.tool_response as Record | undefined) ?? undefined; - const errField = resp?.['error'] ?? resp?.['stderr']; - const successField = resp?.['success']; - if (typeof successField === 'boolean') { - outcome = successField ? 'succeeded' : 'failed'; - } else if (errField && (typeof errField === 'string' ? errField.length > 0 : true)) { - outcome = 'failed'; - } else { - outcome = 'succeeded'; - } - if (outcome === 'failed' && !errorSummary) { - errorSummary = typeof errField === 'string' ? errField : JSON.stringify(errField ?? resp); - } - } - - const dataDir = resolveDataDir(process.env); - const embedding = resolveEmbeddingProvider(process.env, process.env['AUDREY_EMBEDDING_PROVIDER']); - const audrey = new Audrey({ - dataDir, - agent: process.env['AUDREY_AGENT'] ?? 'observe-tool', - embedding, - }); - - try { - const result = audrey.observeTool({ - event: effectiveEvent, - tool: effectiveTool, - sessionId, - input: inputPayload, - output: outputPayload, - outcome, - errorSummary, - cwd, - files: args.files, - metadata: (metadataPayload ?? undefined) as Record | undefined, - retainDetails: args.retainDetails, - }); - const summary = { - id: result.event.id, - event_type: result.event.event_type, - tool_name: result.event.tool_name, - outcome: result.event.outcome, - redaction_state: result.event.redaction_state, - redactions: result.redactions, - }; - console.log(JSON.stringify(summary)); - } finally { - audrey.close(); - } -} - -function parsePromoteArgs(argv: string[]): { - target?: 'claude-rules' | 'agents-md' | 'playbook' | 'hook' | 'checklist'; - minConfidence?: number; - minEvidence?: number; - limit?: number; - dryRun?: boolean; - yes?: boolean; - projectDir?: string; - json?: boolean; -} { - const out: Record = {}; - for (let i = 0; i < argv.length; i++) { - const token = argv[i]; - const next = () => argv[++i]; - if (token === '--target') out.target = next(); - else if (token === '--min-confidence') out.minConfidence = Number.parseFloat(next() ?? ''); - else if (token === '--min-evidence') out.minEvidence = Number.parseInt(next() ?? '', 10); - else if (token === '--limit') out.limit = Number.parseInt(next() ?? '', 10); - else if (token === '--dry-run') out.dryRun = true; - else if (token === '--yes' || token === '-y') out.yes = true; - else if (token === '--project-dir') out.projectDir = next(); - else if (token === '--json') out.json = true; - } - return out as ReturnType; -} - -async function promoteCli(): Promise { - const args = parsePromoteArgs(process.argv.slice(3)); - - const dataDir = resolveDataDir(process.env); - const embedding = resolveEmbeddingProvider(process.env, process.env['AUDREY_EMBEDDING_PROVIDER']); - const audrey = new Audrey({ - dataDir, - agent: process.env['AUDREY_AGENT'] ?? 'promote', - embedding, - }); - - try { - const result = await audrey.promote({ - target: args.target as 'claude-rules' | undefined, - minConfidence: args.minConfidence, - minEvidence: args.minEvidence, - limit: args.limit, - dryRun: args.dryRun ?? !args.yes, - yes: args.yes, - projectDir: args.projectDir, - }); - - if (args.json) { - console.log(JSON.stringify(result, null, 2)); - return; - } - + server.tool('memory_promote', { + target: z.enum(['claude-rules']).optional().describe( + 'Promotion target. Only claude-rules is implemented in PR 4 v1.' + ), + min_confidence: z.number().min(0).max(1).optional().describe( + 'Minimum memory confidence for promotion (default 0.7 for procedural, 0.8 for semantic).' + ), + min_evidence: z.number().int().min(1).optional().describe('Minimum supporting episode count (default 2).'), + limit: z.number().int().min(1).max(50).optional().describe('Max candidates to return/apply (default 20).'), + dry_run: z.boolean().optional().describe('If true (default), return candidates without writing. Pair with yes=true to actually write.'), + yes: z.boolean().optional().describe('Confirm write. Without this or dry_run=false the command stays in dry-run mode.'), + project_dir: z.string().optional().describe( + 'Absolute path to the project root where .claude/rules/ should be created. Defaults to process.cwd().' + ), + }, async ({ + target, + min_confidence, + min_evidence, + limit, + dry_run, + yes, + project_dir, + }) => { + try { + const result = await audrey.promote({ + target, + minConfidence: min_confidence, + minEvidence: min_evidence, + limit, + dryRun: dry_run, + yes, + projectDir: project_dir, + }); + return toolResult(result); + } catch (err) { + return toolError(err); + } + }); + + const transport = new StdioServerTransport(); + await server.connect(transport); + console.error('[audrey-mcp] connected via stdio'); + registerShutdownHandlers(process, audrey); +} + +function parseObserveToolArgs(argv: string[]): { + event?: string; + tool?: string; + sessionId?: string; + outcome?: string; + cwd?: string; + errorSummary?: string; + files?: string[]; + inputJson?: string; + outputJson?: string; + metadataJson?: string; + retainDetails?: boolean; +} { + const out: Record = {}; + for (let i = 0; i < argv.length; i++) { + const token = argv[i]; + const next = () => argv[++i]; + if (token === '--event') out.event = next(); + else if (token === '--tool') out.tool = next(); + else if (token === '--session-id') out.sessionId = next(); + else if (token === '--outcome') out.outcome = next(); + else if (token === '--cwd') out.cwd = next(); + else if (token === '--error-summary') out.errorSummary = next(); + else if (token === '--files') { + const list = next(); + if (list) out.files = list.split(',').map(s => s.trim()).filter(Boolean); + } + else if (token === '--input-json') out.inputJson = next(); + else if (token === '--output-json') out.outputJson = next(); + else if (token === '--metadata-json') out.metadataJson = next(); + else if (token === '--retain-details') out.retainDetails = true; + } + return out as ReturnType; +} + +async function observeToolCli(): Promise { + const args = parseObserveToolArgs(process.argv.slice(3)); + + let stdinPayload: Record | null = null; + if (!process.stdin.isTTY) { + const chunks: Buffer[] = []; + for await (const chunk of process.stdin) chunks.push(chunk as Buffer); + const raw = Buffer.concat(chunks).toString('utf-8').trim(); + if (raw) { + try { stdinPayload = JSON.parse(raw) as Record; } + catch { console.error('[audrey] observe-tool: stdin was not valid JSON, ignoring.'); } + } + } + + // Auto-extract common fields from the Claude Code hook payload so the hook + // config can be minimal: only --event needs to be specified on the command + // line; tool_name / session_id / cwd / hook_event_name come from stdin. + const effectiveEvent = args.event ?? (stdinPayload?.hook_event_name as string | undefined); + const effectiveTool = args.tool ?? (stdinPayload?.tool_name as string | undefined); + + if (!effectiveEvent) { + console.error('[audrey] observe-tool: --event is required (or provide hook_event_name in stdin JSON)'); + process.exit(2); + } + if (!effectiveTool) { + console.error('[audrey] observe-tool: --tool is required (or provide tool_name in stdin JSON)'); + process.exit(2); + } + + const parseMaybeJson = (text: string | undefined): unknown => { + if (text == null) return undefined; + try { return JSON.parse(text); } + catch { return text; } + }; + + const inputPayload = args.inputJson !== undefined + ? parseMaybeJson(args.inputJson) + : stdinPayload?.tool_input ?? stdinPayload?.input; + const outputPayload = args.outputJson !== undefined + ? parseMaybeJson(args.outputJson) + : stdinPayload?.tool_response ?? stdinPayload?.tool_output ?? stdinPayload?.output; + const metadataPayload = args.metadataJson !== undefined + ? parseMaybeJson(args.metadataJson) + : stdinPayload?.metadata; + + const sessionId = args.sessionId ?? (stdinPayload?.session_id as string | undefined); + const cwd = args.cwd ?? (stdinPayload?.cwd as string | undefined); + + // Detect failure from Claude Code hook payload shape: tool_response often + // includes a non-empty error or a success=false flag for failed tools. + let outcome = args.outcome as 'succeeded' | 'failed' | 'blocked' | 'skipped' | 'unknown' | undefined; + let errorSummary = args.errorSummary ?? (stdinPayload?.error_summary as string | undefined); + if (outcome == null && effectiveEvent === 'PostToolUse') { + const resp = (stdinPayload?.tool_response as Record | undefined) ?? undefined; + const errField = resp?.['error'] ?? resp?.['stderr']; + const successField = resp?.['success']; + if (typeof successField === 'boolean') { + outcome = successField ? 'succeeded' : 'failed'; + } else if (errField && (typeof errField === 'string' ? errField.length > 0 : true)) { + outcome = 'failed'; + } else { + outcome = 'succeeded'; + } + if (outcome === 'failed' && !errorSummary) { + errorSummary = typeof errField === 'string' ? errField : JSON.stringify(errField ?? resp); + } + } + + const dataDir = resolveDataDir(process.env); + const embedding = resolveEmbeddingProvider(process.env, process.env['AUDREY_EMBEDDING_PROVIDER']); + const audrey = new Audrey({ + dataDir, + agent: process.env['AUDREY_AGENT'] ?? 'observe-tool', + embedding, + }); + + try { + const result = audrey.observeTool({ + event: effectiveEvent, + tool: effectiveTool, + sessionId, + input: inputPayload, + output: outputPayload, + outcome, + errorSummary, + cwd, + files: args.files, + metadata: (metadataPayload ?? undefined) as Record | undefined, + retainDetails: args.retainDetails, + }); + const summary = { + id: result.event.id, + event_type: result.event.event_type, + tool_name: result.event.tool_name, + outcome: result.event.outcome, + redaction_state: result.event.redaction_state, + redactions: result.redactions, + }; + console.log(JSON.stringify(summary)); + } finally { + audrey.close(); + } +} + +function parsePromoteArgs(argv: string[]): { + target?: 'claude-rules' | 'agents-md' | 'playbook' | 'hook' | 'checklist'; + minConfidence?: number; + minEvidence?: number; + limit?: number; + dryRun?: boolean; + yes?: boolean; + projectDir?: string; + json?: boolean; +} { + const out: Record = {}; + for (let i = 0; i < argv.length; i++) { + const token = argv[i]; + const next = () => argv[++i]; + if (token === '--target') out.target = next(); + else if (token === '--min-confidence') out.minConfidence = Number.parseFloat(next() ?? ''); + else if (token === '--min-evidence') out.minEvidence = Number.parseInt(next() ?? '', 10); + else if (token === '--limit') out.limit = Number.parseInt(next() ?? '', 10); + else if (token === '--dry-run') out.dryRun = true; + else if (token === '--yes' || token === '-y') out.yes = true; + else if (token === '--project-dir') out.projectDir = next(); + else if (token === '--json') out.json = true; + } + return out as ReturnType; +} + +async function promoteCli(): Promise { + const args = parsePromoteArgs(process.argv.slice(3)); + + const dataDir = resolveDataDir(process.env); + const embedding = resolveEmbeddingProvider(process.env, process.env['AUDREY_EMBEDDING_PROVIDER']); + const audrey = new Audrey({ + dataDir, + agent: process.env['AUDREY_AGENT'] ?? 'promote', + embedding, + }); + + try { + const result = await audrey.promote({ + target: args.target as 'claude-rules' | undefined, + minConfidence: args.minConfidence, + minEvidence: args.minEvidence, + limit: args.limit, + dryRun: args.dryRun ?? !args.yes, + yes: args.yes, + projectDir: args.projectDir, + }); + + if (args.json) { + console.log(JSON.stringify(result, null, 2)); + return; + } + + const candidateLabel = `${result.candidates.length} candidate${result.candidates.length === 1 ? '' : 's'}`; + const appliedLabel = `${result.applied.length} rule${result.applied.length === 1 ? '' : 's'}`; const header = result.dry_run - ? `[audrey] promote (dry-run) — ${result.candidates.length} candidate${result.candidates.length === 1 ? '' : 's'} for target "${result.target}"` - : `[audrey] promote — wrote ${result.applied.length} rule${result.applied.length === 1 ? '' : 's'} to ${result.project_dir}`; - console.log(header); - if (result.candidates.length === 0) { - console.log(' (no candidates met the confidence/evidence thresholds)'); - return; - } - for (const c of result.candidates) { - console.log(''); - console.log(` ${c.rendered_path} [score ${c.score.toFixed(1)}]`); - const snippet = c.content.length > 120 ? c.content.slice(0, 117) + '…' : c.content; - console.log(` memory: ${snippet}`); - console.log(` why: ${c.reason}`); - console.log(` confidence=${(c.confidence * 100).toFixed(1)}% evidence=${c.evidence_count} prevented_failures=${c.failure_prevented}`); - } - if (result.dry_run) { - console.log(''); - console.log(' Re-run with --yes to write these rules to disk.'); - } - } finally { - audrey.close(); - } -} - -const isDirectRun = process.argv[1] && resolve(process.argv[1]) === fileURLToPath(import.meta.url); - -if (isDirectRun) { + ? `[audrey] promote (dry-run) - ${candidateLabel} for target "${result.target}"` + : `[audrey] promote - wrote ${appliedLabel} to ${result.project_dir}`; + console.log(header); + if (result.candidates.length === 0) { + console.log(' (no candidates met the confidence/evidence thresholds)'); + return; + } + for (const c of result.candidates) { + console.log(''); + console.log(` ${c.rendered_path} [score ${c.score.toFixed(1)}]`); + const snippet = c.content.length > 120 ? c.content.slice(0, 117) + '...' : c.content; + console.log(` memory: ${snippet}`); + console.log(` why: ${c.reason}`); + console.log( + ` confidence=${(c.confidence * 100).toFixed(1)}% ` + + `evidence=${c.evidence_count} prevented_failures=${c.failure_prevented}` + ); + } + if (result.dry_run) { + console.log(''); + console.log(' Re-run with --yes to write these rules to disk.'); + } + } finally { + audrey.close(); + } +} + +const isDirectRun = process.argv[1] && resolve(process.argv[1]) === fileURLToPath(import.meta.url); + +if (isDirectRun) { if (subcommand === 'install') { install(); } else if (subcommand === 'uninstall') { uninstall(); - } else if (subcommand === 'reembed') { - reembed().catch(err => { - console.error('[audrey] reembed failed:', err); - process.exit(1); - }); - } else if (subcommand === 'dream') { - dream().catch(err => { - console.error('[audrey] dream failed:', err); - process.exit(1); - }); - } else if (subcommand === 'greeting') { - greeting().catch(err => { - console.error('[audrey] greeting failed:', err); - process.exit(1); - }); - } else if (subcommand === 'reflect') { - reflect().catch(err => { - console.error('[audrey] reflect failed:', err); + } else if (subcommand === 'mcp-config') { + printMcpConfig(); + } else if (subcommand === 'demo') { + runDemoCommand().catch(err => { + console.error('[audrey] demo failed:', err); process.exit(1); }); - } else if (subcommand === 'serve') { - serveHttp().catch(err => { - console.error('[audrey] serve failed:', err); - process.exit(1); - }); - } else if (subcommand === 'status') { - status(); - } else if (subcommand === 'observe-tool') { - observeToolCli().catch(err => { - console.error('[audrey] observe-tool failed:', err); - process.exit(1); - }); - } else if (subcommand === 'promote') { - promoteCli().catch(err => { - console.error('[audrey] promote failed:', err); - process.exit(1); - }); - } else { - main().catch(err => { - console.error('[audrey-mcp] fatal:', err); - process.exit(1); - }); - } -} + } else if (subcommand === 'reembed') { + reembed().catch(err => { + console.error('[audrey] reembed failed:', err); + process.exit(1); + }); + } else if (subcommand === 'dream') { + dream().catch(err => { + console.error('[audrey] dream failed:', err); + process.exit(1); + }); + } else if (subcommand === 'greeting') { + greeting().catch(err => { + console.error('[audrey] greeting failed:', err); + process.exit(1); + }); + } else if (subcommand === 'reflect') { + reflect().catch(err => { + console.error('[audrey] reflect failed:', err); + process.exit(1); + }); + } else if (subcommand === 'serve') { + serveHttp().catch(err => { + console.error('[audrey] serve failed:', err); + process.exit(1); + }); + } else if (subcommand === 'status') { + status(); + } else if (subcommand === 'observe-tool') { + observeToolCli().catch(err => { + console.error('[audrey] observe-tool failed:', err); + process.exit(1); + }); + } else if (subcommand === 'promote') { + promoteCli().catch(err => { + console.error('[audrey] promote failed:', err); + process.exit(1); + }); + } else { + main().catch(err => { + console.error('[audrey-mcp] fatal:', err); + process.exit(1); + }); + } +} diff --git a/package-lock.json b/package-lock.json index c5764c2..de0f6b7 100644 --- a/package-lock.json +++ b/package-lock.json @@ -2519,9 +2519,9 @@ } }, "node_modules/hono": { - "version": "4.12.12", - "resolved": "https://registry.npmjs.org/hono/-/hono-4.12.12.tgz", - "integrity": "sha512-p1JfQMKaceuCbpJKAPKVqyqviZdS0eUxH9v82oWo1kb9xjQ5wA6iP3FNVAPDFlz5/p7d45lO+BpSk1tuSZMF4Q==", + "version": "4.12.14", + "resolved": "https://registry.npmjs.org/hono/-/hono-4.12.14.tgz", + "integrity": "sha512-am5zfg3yu6sqn5yjKBNqhnTX7Cv+m00ox+7jbaKkrLMRJ4rAdldd1xPd/JzbBWspqaQv6RSTrgFN95EsfhC+7w==", "license": "MIT", "engines": { "node": ">=16.9.0" @@ -3066,9 +3066,9 @@ } }, "node_modules/protobufjs": { - "version": "7.5.4", - "resolved": "https://registry.npmjs.org/protobufjs/-/protobufjs-7.5.4.tgz", - "integrity": "sha512-CvexbZtbov6jW2eXAvLukXjXUW1TzFaivC46BpWc/3BpcCysb5Vffu+B3XHMm8lVEuy2Mm4XGex8hBSg1yapPg==", + "version": "7.5.5", + "resolved": "https://registry.npmjs.org/protobufjs/-/protobufjs-7.5.5.tgz", + "integrity": "sha512-3wY1AxV+VBNW8Yypfd1yQY9pXnqTAN+KwQxL8iYm3/BjKYMNg4i0owhEe26PWDOMaIrzeeF98Lqd5NGz4omiIg==", "hasInstallScript": true, "license": "BSD-3-Clause", "dependencies": { diff --git a/package.json b/package.json index 179e1ab..9d9e6cb 100644 --- a/package.json +++ b/package.json @@ -1,7 +1,7 @@ { "name": "audrey", "version": "0.20.0", - "description": "Biological memory architecture for AI agents - encode, consolidate, and recall memories with confidence decay, contradiction detection, and causal graphs", + "description": "Local-first memory runtime for AI agents with recall, consolidation, memory reflexes, contradiction detection, and tool-trace learning", "type": "module", "main": "dist/src/index.js", "types": "dist/src/index.d.ts", @@ -27,6 +27,10 @@ "dist/", "docs/production-readiness.md", "docs/benchmarking.md", + "docs/audrey-for-dummies.md", + "docs/future-of-llm-memory.md", + "docs/mcp-hosts.md", + "docs/ollama-local-agents.md", "docs/assets/benchmarks/", "examples/", "README.md", @@ -76,6 +80,12 @@ "confidence", "long-term-memory", "persistent-memory", + "memory-preflight", + "memory-reflexes", + "agent-reflexes", + "agent-safety", + "tool-trace-memory", + "local-first-memory", "rag", "claude", "agent-framework", @@ -112,5 +122,10 @@ "@types/node": "^25.6.0", "typescript": "^6.0.2", "vitest": "^4.0.18" + }, + "directories": { + "doc": "docs", + "example": "examples", + "test": "tests" } } diff --git a/src/audrey.ts b/src/audrey.ts index 2af9a5e..b3f9e5b 100644 --- a/src/audrey.ts +++ b/src/audrey.ts @@ -1,781 +1,795 @@ -import { EventEmitter } from 'node:events'; -import Database from 'better-sqlite3'; -import type { - AudreyConfig, - ConfidenceConfig, - ConsolidationOptions, - ConsolidationResult, - DecayResult, - DreamResult, - EmbeddingProvider, - EncodeParams, - ForgetResult, - GreetingOptions, - GreetingResult, - HalfLives, - IntrospectResult, - LLMProvider, - MemoryStatusResult, - PurgeResult, - RecallOptions, - RecallResult, - ReembedCounts, - ReflectResult, - TruthResolution, - ConsolidationRunRow, - Affect, -} from './types.js'; -import { createDatabase, closeDatabase } from './db.js'; -import { createEmbeddingProvider } from './embedding.js'; -import { createLLMProvider } from './llm.js'; -import { encodeEpisode } from './encode.js'; -import { recall as recallFn, recallStream as recallStreamFn } from './recall.js'; -import { validateMemory } from './validate.js'; -import { runConsolidation } from './consolidate.js'; -import { applyDecay } from './decay.js'; -import { rollbackConsolidation, getConsolidationHistory } from './rollback.js'; -import { forgetMemory, forgetByQuery as forgetByQueryFn, purgeMemories } from './forget.js'; -import { introspect as introspectFn } from './introspect.js'; -import { buildContextResolutionPrompt, buildReflectionPrompt } from './prompts.js'; -import { exportMemories } from './export.js'; -import { importMemories } from './import.js'; -import { suggestConsolidationParams as suggestParamsFn } from './adaptive.js'; -import { reembedAll } from './migrate.js'; -import { applyInterference } from './interference.js'; -import { detectResonance } from './affect.js'; -import { observeTool, type ObserveToolInput, type ObserveToolResult } from './tool-trace.js'; -import { - listEvents, - countEvents, - recentFailures, - type EventQuery, - type FailurePattern, - type MemoryEvent, -} from './events.js'; +import { EventEmitter } from 'node:events'; +import Database from 'better-sqlite3'; +import type { + AudreyConfig, + ConfidenceConfig, + ConsolidationOptions, + ConsolidationResult, + DecayResult, + DreamResult, + EmbeddingProvider, + EncodeParams, + ForgetResult, + GreetingOptions, + GreetingResult, + HalfLives, + IntrospectResult, + LLMProvider, + MemoryStatusResult, + PurgeResult, + RecallOptions, + RecallResult, + ReembedCounts, + ReflectResult, + TruthResolution, + ConsolidationRunRow, + Affect, +} from './types.js'; +import { createDatabase, closeDatabase } from './db.js'; +import { createEmbeddingProvider } from './embedding.js'; +import { createLLMProvider } from './llm.js'; +import { encodeEpisode } from './encode.js'; +import { recall as recallFn, recallStream as recallStreamFn } from './recall.js'; +import { validateMemory } from './validate.js'; +import { runConsolidation } from './consolidate.js'; +import { applyDecay } from './decay.js'; +import { rollbackConsolidation, getConsolidationHistory } from './rollback.js'; +import { forgetMemory, forgetByQuery as forgetByQueryFn, purgeMemories } from './forget.js'; +import { introspect as introspectFn } from './introspect.js'; +import { buildContextResolutionPrompt, buildReflectionPrompt } from './prompts.js'; +import { exportMemories } from './export.js'; +import { importMemories } from './import.js'; +import { suggestConsolidationParams as suggestParamsFn } from './adaptive.js'; +import { reembedAll } from './migrate.js'; +import { applyInterference } from './interference.js'; +import { detectResonance } from './affect.js'; +import { observeTool, type ObserveToolInput, type ObserveToolResult } from './tool-trace.js'; +import { + listEvents, + countEvents, + recentFailures, + type EventQuery, + type FailurePattern, + type MemoryEvent, +} from './events.js'; import { buildCapsule, type CapsuleOptions, type MemoryCapsule } from './capsule.js'; +import { buildPreflight, type MemoryPreflight, type PreflightOptions } from './preflight.js'; +import { buildReflexReport, type MemoryReflexReport, type ReflexOptions } from './reflexes.js'; import { findPromotionCandidates, type FindCandidatesOptions, - type PromotionCandidate, - type PromotionTarget, -} from './promote.js'; -import { renderAllRules, type RuleDoc } from './rules-compiler.js'; -import { insertEvent } from './events.js'; -import { mkdirSync, writeFileSync, existsSync } from 'node:fs'; -import { dirname, join, resolve as pathResolve } from 'node:path'; - -interface ConfigRow { - value: string; -} - -interface CountRow { - c: number; -} - -interface ContentRow { - content: string; -} - -interface StatusRow { - status: string; -} - -interface AffectRow { - affect: string; -} - -interface GreetingEpisodeRow { - id: string; - content: string; - source: string; - tags: string | null; - salience: number; - created_at: string; -} - -interface GreetingPrincipleRow { - id: string; - content: string; - salience: number; - created_at: string; -} - -interface GreetingIdentityRow { - id: string; - content: string; - tags: string | null; - salience: number; - created_at: string; -} - -interface GreetingUnresolvedRow { - id: string; - content: string; - tags: string | null; - salience: number; - created_at: string; -} - -export class Audrey extends EventEmitter { - agent: string; - dataDir: string; - embeddingProvider: EmbeddingProvider; - db: Database.Database; - llmProvider: LLMProvider | null; - confidenceConfig: ConfidenceConfig; - consolidationConfig: { minEpisodes: number }; - decayConfig: { dormantThreshold: number }; - interferenceConfig: { enabled: boolean; k: number; threshold: number; weight: number }; - contextConfig: { enabled: boolean; weight: number }; - affectConfig: { - enabled: boolean; - weight: number; - arousalWeight: number; - resonance: { enabled: boolean; k: number; threshold: number; affectThreshold: number }; - }; - autoReflect: boolean; - - private _migrationPending: boolean; - private _autoConsolidateTimer: ReturnType | null; - private _closed: boolean; - - constructor({ - dataDir = './audrey-data', - agent = 'default', - embedding = { provider: 'mock', dimensions: 64 }, - llm, - confidence = {}, - consolidation = {}, - decay = {}, - interference = {}, - context = {}, - affect = {}, - autoReflect = false, - }: AudreyConfig = {}) { - super(); - - const dormantThreshold = decay.dormantThreshold ?? 0.1; - if (dormantThreshold < 0 || dormantThreshold > 1) { - throw new Error(`dormantThreshold must be between 0 and 1, got: ${dormantThreshold}`); - } - - const minEpisodes = consolidation.minEpisodes ?? 3; - if (!Number.isInteger(minEpisodes) || minEpisodes < 1) { - throw new Error(`minEpisodes must be a positive integer, got: ${minEpisodes}`); - } - - this.agent = agent; - this.dataDir = dataDir; - this.embeddingProvider = createEmbeddingProvider(embedding); - const { db, migrated } = createDatabase(dataDir, { dimensions: this.embeddingProvider.dimensions }); - this.db = db; - this._migrationPending = migrated; - this.llmProvider = llm ? createLLMProvider(llm) : null; - this.confidenceConfig = { - weights: confidence.weights, - halfLives: confidence.halfLives, - sourceReliability: confidence.sourceReliability, - interferenceWeight: interference.weight ?? 0.1, - contextWeight: context.weight ?? 0.3, - affectWeight: affect.weight ?? 0.2, - }; - this.consolidationConfig = { - minEpisodes: consolidation.minEpisodes || 3, - }; - this.decayConfig = { dormantThreshold: decay.dormantThreshold || 0.1 }; - this._autoConsolidateTimer = null; - this._closed = false; - this.interferenceConfig = { - enabled: interference.enabled ?? true, - k: interference.k ?? 5, - threshold: interference.threshold ?? 0.6, - weight: interference.weight ?? 0.1, - }; - this.contextConfig = { - enabled: context.enabled ?? true, - weight: context.weight ?? 0.3, - }; - this.affectConfig = { - enabled: affect.enabled ?? true, - weight: affect.weight ?? 0.2, - arousalWeight: affect.arousalWeight ?? 0.3, - resonance: { - enabled: affect.resonance?.enabled ?? true, - k: affect.resonance?.k ?? 5, - threshold: affect.resonance?.threshold ?? 0.5, - affectThreshold: affect.resonance?.affectThreshold ?? 0.6, - }, - }; - this.autoReflect = autoReflect; - } - - async _ensureMigrated(): Promise { - if (!this._migrationPending) return; - const counts = await reembedAll(this.db, this.embeddingProvider); - this._migrationPending = false; - this.emit('migration', counts); - } - - _emitValidation(id: string, params: EncodeParams): void { - validateMemory(this.db, this.embeddingProvider, { id, ...params }, { - llmProvider: this.llmProvider, - }) - .then(validation => { - if (validation.action === 'reinforced') { - this.emit('reinforcement', { - episodeId: id, - targetId: validation.semanticId, - similarity: validation.similarity, - }); - } else if (validation.action === 'contradiction') { - this.emit('contradiction', { - episodeId: id, - contradictionId: validation.contradictionId, - semanticId: validation.semanticId, - similarity: validation.similarity, - resolution: validation.resolution, - }); - } - }) - .catch(err => this.emit('error', err)); - } - - async encode(params: EncodeParams): Promise { - await this._ensureMigrated(); - const encodeParams = { ...params, arousalWeight: this.affectConfig.arousalWeight }; - const id = await encodeEpisode(this.db, this.embeddingProvider, encodeParams); - this.emit('encode', { id, ...params }); - if (this.interferenceConfig.enabled) { - applyInterference(this.db, this.embeddingProvider, id, params, this.interferenceConfig) - .then(affected => { - if (affected.length > 0) { - this.emit('interference', { episodeId: id, affected }); - } - }) - .catch(err => this.emit('error', err)); - } - if (this.affectConfig.enabled && this.affectConfig.resonance.enabled && params.affect?.valence !== undefined) { - detectResonance(this.db, this.embeddingProvider, id, params, this.affectConfig.resonance) - .then(echoes => { - if (echoes.length > 0) { - this.emit('resonance', { episodeId: id, affect: params.affect, echoes }); - } - }) - .catch(err => this.emit('error', err)); - } - this._emitValidation(id, params); - return id; - } - - async reflect(turns: { role: string; content: string }[]): Promise { - if (!this.llmProvider) return { encoded: 0, memories: [], skipped: 'no llm provider' }; - - const prompt = buildReflectionPrompt(turns); - let raw: string; - try { - raw = await this.llmProvider.chat!(prompt as unknown as string) as string; - } catch (err) { - this.emit('error', err); - return { encoded: 0, memories: [], skipped: 'llm error' }; - } - - let parsed: { memories?: Array<{ content?: string; source?: string; salience?: number; tags?: string[]; private?: boolean; affect?: Affect }> }; - try { - parsed = JSON.parse(raw); - } catch { - return { encoded: 0, memories: [], skipped: 'invalid llm response' }; - } - - const memories = parsed.memories ?? []; - let encoded = 0; - for (const mem of memories) { - if (!mem.content || !mem.source) continue; - try { - await this.encode({ - content: mem.content, - source: mem.source as EncodeParams['source'], - salience: mem.salience, - tags: mem.tags, - private: mem.private ?? false, - affect: mem.affect ?? undefined, - }); - encoded++; - } catch (err) { - this.emit('error', err); - } - } - - return { encoded, memories: memories as ReflectResult['memories'] }; - } - - async encodeBatch(paramsList: EncodeParams[]): Promise { - await this._ensureMigrated(); - const ids: string[] = []; - for (const params of paramsList) { - const id = await encodeEpisode(this.db, this.embeddingProvider, params); - ids.push(id); - this.emit('encode', { id, ...params }); - } - - for (let i = 0; i < ids.length; i++) { - this._emitValidation(ids[i]!, paramsList[i]!); - } - - return ids; - } - - async recall(query: string, options: RecallOptions = {}): Promise { - await this._ensureMigrated(); - return recallFn(this.db, this.embeddingProvider, query, { - ...options, - confidenceConfig: this._recallConfig(options), - }); - } - - async *recallStream(query: string, options: RecallOptions = {}): AsyncGenerator { - await this._ensureMigrated(); - yield* recallStreamFn(this.db, this.embeddingProvider, query, { - ...options, - confidenceConfig: this._recallConfig(options), - }); - } - - _recallConfig(options: RecallOptions): ConfidenceConfig { - let config: ConfidenceConfig = options.confidenceConfig ?? this.confidenceConfig; - if (this.contextConfig.enabled && options.context) { - config = { ...config, retrievalContext: options.context }; - } - if (this.affectConfig.enabled && options.mood) { - config = { ...config, retrievalMood: options.mood }; - } - return config; - } - - async consolidate(options: Partial = {}): Promise { - await this._ensureMigrated(); - const result = await runConsolidation(this.db, this.embeddingProvider, { - minClusterSize: options.minClusterSize || this.consolidationConfig.minEpisodes, - similarityThreshold: options.similarityThreshold || 0.80, - extractPrinciple: options.extractPrinciple, - llmProvider: options.llmProvider || this.llmProvider || undefined, - }); - const run = db_prepare_get_status(this.db, result.runId); - const output = { ...result, status: run?.status || 'completed' }; - this.emit('consolidation', output); - return output; - } - - decay(options: { dormantThreshold?: number; halfLives?: Partial } = {}): DecayResult { - const result = applyDecay(this.db, { - dormantThreshold: options.dormantThreshold || this.decayConfig.dormantThreshold, - halfLives: options.halfLives ?? this.confidenceConfig.halfLives, - }); - this.emit('decay', result); - return result; - } - - rollback(runId: string): { rolledBackMemories: number; restoredEpisodes: number } { - const result = rollbackConsolidation(this.db, runId); - this.emit('rollback', { runId, ...result }); - return result; - } - - async resolveTruth(contradictionId: string): Promise { - if (!this.llmProvider) { - throw new Error('resolveTruth requires an LLM provider'); - } - - const contradiction = this.db.prepare( - 'SELECT * FROM contradictions WHERE id = ?' - ).get(contradictionId) as { claim_a_id: string; claim_a_type: string; claim_b_id: string; claim_b_type: string } | undefined; - if (!contradiction) throw new Error(`Contradiction not found: ${contradictionId}`); - - const claimA = this._loadClaimContent(contradiction.claim_a_id, contradiction.claim_a_type); - const claimB = this._loadClaimContent(contradiction.claim_b_id, contradiction.claim_b_type); - - const messages = buildContextResolutionPrompt(claimA, claimB); - const result = await this.llmProvider.json(messages) as TruthResolution; - - const now = new Date().toISOString(); - const newState = result.resolution === 'context_dependent' ? 'context_dependent' : 'resolved'; - this.db.prepare(` - UPDATE contradictions SET state = ?, resolution = ?, resolved_at = ? - WHERE id = ? - `).run(newState, JSON.stringify(result), now, contradictionId); - - if (result.resolution === 'a_wins' && contradiction.claim_a_type === 'semantic') { - this.db.prepare("UPDATE semantics SET state = 'active' WHERE id = ?").run(contradiction.claim_a_id); - } - if (result.resolution === 'b_wins' && contradiction.claim_b_type === 'semantic') { - this.db.prepare("UPDATE semantics SET state = 'active' WHERE id = ?").run(contradiction.claim_b_id); - } - if (result.resolution === 'context_dependent') { - if (contradiction.claim_a_type === 'semantic' && result.conditions) { - this.db.prepare("UPDATE semantics SET state = 'context_dependent', conditions = ? WHERE id = ?") - .run(JSON.stringify(result.conditions), contradiction.claim_a_id); - } - } - - return result; - } - - _loadClaimContent(claimId: string, claimType: string): string { - if (claimType === 'semantic') { - const row = this.db.prepare('SELECT content FROM semantics WHERE id = ?').get(claimId) as ContentRow | undefined; - if (!row) throw new Error(`Semantic memory not found: ${claimId}`); - return row.content; - } else if (claimType === 'episodic') { - const row = this.db.prepare('SELECT content FROM episodes WHERE id = ?').get(claimId) as ContentRow | undefined; - if (!row) throw new Error(`Episode not found: ${claimId}`); - return row.content; - } - throw new Error(`Unknown claim type: ${claimType}`); - } - - consolidationHistory(): ConsolidationRunRow[] { - return getConsolidationHistory(this.db); - } - - introspect(): IntrospectResult { - return introspectFn(this.db); - } - - memoryStatus(): MemoryStatusResult { - const episodes = (this.db.prepare('SELECT COUNT(*) as c FROM episodes').get() as CountRow).c; - const semantics = (this.db.prepare('SELECT COUNT(*) as c FROM semantics').get() as CountRow).c; - const procedures = (this.db.prepare('SELECT COUNT(*) as c FROM procedures').get() as CountRow).c; - const searchableEpisodes = (this.db.prepare('SELECT COUNT(*) as c FROM episodes WHERE embedding IS NOT NULL').get() as CountRow).c; - const searchableSemantics = (this.db.prepare('SELECT COUNT(*) as c FROM semantics WHERE embedding IS NOT NULL').get() as CountRow).c; - const searchableProcedures = (this.db.prepare('SELECT COUNT(*) as c FROM procedures WHERE embedding IS NOT NULL').get() as CountRow).c; - - let vecEpisodes = 0, vecSemantics = 0, vecProcedures = 0; - try { - vecEpisodes = (this.db.prepare('SELECT COUNT(*) as c FROM vec_episodes').get() as CountRow).c; - vecSemantics = (this.db.prepare('SELECT COUNT(*) as c FROM vec_semantics').get() as CountRow).c; - vecProcedures = (this.db.prepare('SELECT COUNT(*) as c FROM vec_procedures').get() as CountRow).c; - } catch { - // vec tables may not exist if no dimensions configured - } - - const dimsRow = this.db.prepare("SELECT value FROM audrey_config WHERE key = 'dimensions'").get() as ConfigRow | undefined; - const dimensions = dimsRow ? parseInt(dimsRow.value, 10) : null; - const versionRow = this.db.prepare("SELECT value FROM audrey_config WHERE key = 'schema_version'").get() as ConfigRow | undefined; - const schemaVersion = versionRow ? parseInt(versionRow.value, 10) : 0; - - const device = this.embeddingProvider._actualDevice - ?? this.embeddingProvider.device - ?? null; - - const healthy = episodes === vecEpisodes - && semantics === vecSemantics - && procedures === vecProcedures; - const reembedRecommended = searchableEpisodes !== vecEpisodes - || searchableSemantics !== vecSemantics - || searchableProcedures !== vecProcedures; - - return { - episodes, - vec_episodes: vecEpisodes, - semantics, - vec_semantics: vecSemantics, - procedures, - vec_procedures: vecProcedures, - searchable_episodes: searchableEpisodes, - searchable_semantics: searchableSemantics, - searchable_procedures: searchableProcedures, - dimensions, - schema_version: schemaVersion, - device: device ?? null, - healthy, - reembed_recommended: reembedRecommended, - }; - } - - async greeting({ context, recentLimit = 10, principleLimit = 5, identityLimit = 5 }: GreetingOptions = {}): Promise { - const recent = this.db.prepare( - 'SELECT id, content, source, tags, salience, created_at FROM episodes WHERE "private" = 0 ORDER BY created_at DESC LIMIT ?' - ).all(recentLimit) as GreetingEpisodeRow[]; - - const principles = this.db.prepare( - 'SELECT id, content, salience, created_at FROM semantics WHERE state = ? ORDER BY salience DESC LIMIT ?' - ).all('active', principleLimit) as GreetingPrincipleRow[]; - - const identity = this.db.prepare( - 'SELECT id, content, tags, salience, created_at FROM episodes WHERE "private" = 1 ORDER BY created_at DESC LIMIT ?' - ).all(identityLimit) as GreetingIdentityRow[]; - - const unresolved = this.db.prepare( - "SELECT id, content, tags, salience, created_at FROM episodes WHERE tags LIKE '%unresolved%' AND salience > 0.3 ORDER BY created_at DESC LIMIT 10" - ).all() as GreetingUnresolvedRow[]; - - const rawAffectRows = this.db.prepare( - "SELECT affect FROM episodes WHERE affect IS NOT NULL AND affect != '{}' ORDER BY created_at DESC LIMIT 20" - ).all() as AffectRow[]; - - const affectParsed = rawAffectRows - .map(r => { try { return JSON.parse(r.affect) as Affect; } catch { return null; } }) - .filter((a): a is Affect => a !== null && a.valence !== undefined); - - let mood: { valence: number; arousal: number; samples: number }; - if (affectParsed.length === 0) { - mood = { valence: 0, arousal: 0, samples: 0 }; - } else { - const sumV = affectParsed.reduce((s, a) => s + (a.valence ?? 0), 0); - const sumA = affectParsed.reduce((s, a) => s + (a.arousal ?? 0), 0); - mood = { - valence: sumV / affectParsed.length, - arousal: sumA / affectParsed.length, - samples: affectParsed.length, - }; - } - - const result: GreetingResult = { recent, principles, mood, unresolved, identity }; - - if (context) { - result.contextual = await this.recall(context, { limit: 5, includePrivate: true }); - } - - return result; - } - - async dream(options: { - minClusterSize?: number; - similarityThreshold?: number; - dormantThreshold?: number; - } = {}): Promise { - await this._ensureMigrated(); - - const consolidation = await this.consolidate({ - minClusterSize: options.minClusterSize, - similarityThreshold: options.similarityThreshold, - }); - - const decay = this.decay({ - dormantThreshold: options.dormantThreshold, - }); - - const stats = this.introspect(); - - const result: DreamResult = { - consolidation, - decay, - stats, - }; - - this.emit('dream', result); - return result; - } - - export(): object { - return exportMemories(this.db); - } - - async import(snapshot: unknown): Promise { - return importMemories(this.db, this.embeddingProvider, snapshot); - } - - startAutoConsolidate(intervalMs: number, options: Partial = {}): void { - if (intervalMs < 1000) { - throw new Error('Auto-consolidation interval must be at least 1000ms'); - } - if (this._autoConsolidateTimer) { - throw new Error('Auto-consolidation is already running'); - } - this._autoConsolidateTimer = setInterval(() => { - this.consolidate(options).catch(err => this.emit('error', err)); - }, intervalMs); - if (typeof this._autoConsolidateTimer.unref === 'function') { - this._autoConsolidateTimer.unref(); - } - } - - stopAutoConsolidate(): void { - if (this._autoConsolidateTimer) { - clearInterval(this._autoConsolidateTimer); - this._autoConsolidateTimer = null; - } - } - - suggestConsolidationParams(): { minClusterSize: number; similarityThreshold: number; confidence: string } { - return suggestParamsFn(this.db); - } - - forget(id: string, options: { purge?: boolean } = {}): ForgetResult { - const result = forgetMemory(this.db, id, options); - this.emit('forget', result); - return result; - } - - async forgetByQuery(query: string, options: { minSimilarity?: number; purge?: boolean } = {}): Promise { - await this._ensureMigrated(); - const result = await forgetByQueryFn(this.db, this.embeddingProvider, query, options); - if (result) this.emit('forget', result); - return result; - } - - purge(): PurgeResult { - const result = purgeMemories(this.db); - this.emit('purge', result); - return result; - } - - close(): void { - if (this._closed) return; - this._closed = true; - this.stopAutoConsolidate(); - closeDatabase(this.db); - } - - async waitForIdle(): Promise { - return Promise.resolve(); - } - - observeTool(input: ObserveToolInput): ObserveToolResult { - const result = observeTool(this.db, { - ...input, - actorAgent: input.actorAgent ?? this.agent, - }); - this.emit('tool-observed', result.event); - return result; - } - - listEvents(query: EventQuery = {}): MemoryEvent[] { - return listEvents(this.db, query); - } - - countEvents(query: EventQuery = {}): number { - return countEvents(this.db, query); - } - - recentFailures(options: { since?: string; limit?: number } = {}): FailurePattern[] { - return recentFailures(this.db, options); - } - + type PromotionCandidate, + type PromotionTarget, +} from './promote.js'; +import { renderAllRules, type RuleDoc } from './rules-compiler.js'; +import { insertEvent } from './events.js'; +import { mkdirSync, writeFileSync, existsSync } from 'node:fs'; +import { dirname, join, resolve as pathResolve } from 'node:path'; + +interface ConfigRow { + value: string; +} + +interface CountRow { + c: number; +} + +interface ContentRow { + content: string; +} + +interface StatusRow { + status: string; +} + +interface AffectRow { + affect: string; +} + +interface GreetingEpisodeRow { + id: string; + content: string; + source: string; + tags: string | null; + salience: number; + created_at: string; +} + +interface GreetingPrincipleRow { + id: string; + content: string; + salience: number; + created_at: string; +} + +interface GreetingIdentityRow { + id: string; + content: string; + tags: string | null; + salience: number; + created_at: string; +} + +interface GreetingUnresolvedRow { + id: string; + content: string; + tags: string | null; + salience: number; + created_at: string; +} + +export class Audrey extends EventEmitter { + agent: string; + dataDir: string; + embeddingProvider: EmbeddingProvider; + db: Database.Database; + llmProvider: LLMProvider | null; + confidenceConfig: ConfidenceConfig; + consolidationConfig: { minEpisodes: number }; + decayConfig: { dormantThreshold: number }; + interferenceConfig: { enabled: boolean; k: number; threshold: number; weight: number }; + contextConfig: { enabled: boolean; weight: number }; + affectConfig: { + enabled: boolean; + weight: number; + arousalWeight: number; + resonance: { enabled: boolean; k: number; threshold: number; affectThreshold: number }; + }; + autoReflect: boolean; + + private _migrationPending: boolean; + private _autoConsolidateTimer: ReturnType | null; + private _closed: boolean; + + constructor({ + dataDir = './audrey-data', + agent = 'default', + embedding = { provider: 'mock', dimensions: 64 }, + llm, + confidence = {}, + consolidation = {}, + decay = {}, + interference = {}, + context = {}, + affect = {}, + autoReflect = false, + }: AudreyConfig = {}) { + super(); + + const dormantThreshold = decay.dormantThreshold ?? 0.1; + if (dormantThreshold < 0 || dormantThreshold > 1) { + throw new Error(`dormantThreshold must be between 0 and 1, got: ${dormantThreshold}`); + } + + const minEpisodes = consolidation.minEpisodes ?? 3; + if (!Number.isInteger(minEpisodes) || minEpisodes < 1) { + throw new Error(`minEpisodes must be a positive integer, got: ${minEpisodes}`); + } + + this.agent = agent; + this.dataDir = dataDir; + this.embeddingProvider = createEmbeddingProvider(embedding); + const { db, migrated } = createDatabase(dataDir, { dimensions: this.embeddingProvider.dimensions }); + this.db = db; + this._migrationPending = migrated; + this.llmProvider = llm ? createLLMProvider(llm) : null; + this.confidenceConfig = { + weights: confidence.weights, + halfLives: confidence.halfLives, + sourceReliability: confidence.sourceReliability, + interferenceWeight: interference.weight ?? 0.1, + contextWeight: context.weight ?? 0.3, + affectWeight: affect.weight ?? 0.2, + }; + this.consolidationConfig = { + minEpisodes: consolidation.minEpisodes || 3, + }; + this.decayConfig = { dormantThreshold: decay.dormantThreshold || 0.1 }; + this._autoConsolidateTimer = null; + this._closed = false; + this.interferenceConfig = { + enabled: interference.enabled ?? true, + k: interference.k ?? 5, + threshold: interference.threshold ?? 0.6, + weight: interference.weight ?? 0.1, + }; + this.contextConfig = { + enabled: context.enabled ?? true, + weight: context.weight ?? 0.3, + }; + this.affectConfig = { + enabled: affect.enabled ?? true, + weight: affect.weight ?? 0.2, + arousalWeight: affect.arousalWeight ?? 0.3, + resonance: { + enabled: affect.resonance?.enabled ?? true, + k: affect.resonance?.k ?? 5, + threshold: affect.resonance?.threshold ?? 0.5, + affectThreshold: affect.resonance?.affectThreshold ?? 0.6, + }, + }; + this.autoReflect = autoReflect; + } + + async _ensureMigrated(): Promise { + if (!this._migrationPending) return; + const counts = await reembedAll(this.db, this.embeddingProvider); + this._migrationPending = false; + this.emit('migration', counts); + } + + _emitValidation(id: string, params: EncodeParams): void { + validateMemory(this.db, this.embeddingProvider, { id, ...params }, { + llmProvider: this.llmProvider, + }) + .then(validation => { + if (validation.action === 'reinforced') { + this.emit('reinforcement', { + episodeId: id, + targetId: validation.semanticId, + similarity: validation.similarity, + }); + } else if (validation.action === 'contradiction') { + this.emit('contradiction', { + episodeId: id, + contradictionId: validation.contradictionId, + semanticId: validation.semanticId, + similarity: validation.similarity, + resolution: validation.resolution, + }); + } + }) + .catch(err => this.emit('error', err)); + } + + async encode(params: EncodeParams): Promise { + await this._ensureMigrated(); + const encodeParams = { ...params, arousalWeight: this.affectConfig.arousalWeight }; + const id = await encodeEpisode(this.db, this.embeddingProvider, encodeParams); + this.emit('encode', { id, ...params }); + if (this.interferenceConfig.enabled) { + applyInterference(this.db, this.embeddingProvider, id, params, this.interferenceConfig) + .then(affected => { + if (affected.length > 0) { + this.emit('interference', { episodeId: id, affected }); + } + }) + .catch(err => this.emit('error', err)); + } + if (this.affectConfig.enabled && this.affectConfig.resonance.enabled && params.affect?.valence !== undefined) { + detectResonance(this.db, this.embeddingProvider, id, params, this.affectConfig.resonance) + .then(echoes => { + if (echoes.length > 0) { + this.emit('resonance', { episodeId: id, affect: params.affect, echoes }); + } + }) + .catch(err => this.emit('error', err)); + } + this._emitValidation(id, params); + return id; + } + + async reflect(turns: { role: string; content: string }[]): Promise { + if (!this.llmProvider) return { encoded: 0, memories: [], skipped: 'no llm provider' }; + + const prompt = buildReflectionPrompt(turns); + let raw: string; + try { + raw = await this.llmProvider.chat!(prompt as unknown as string) as string; + } catch (err) { + this.emit('error', err); + return { encoded: 0, memories: [], skipped: 'llm error' }; + } + + let parsed: { memories?: Array<{ content?: string; source?: string; salience?: number; tags?: string[]; private?: boolean; affect?: Affect }> }; + try { + parsed = JSON.parse(raw); + } catch { + return { encoded: 0, memories: [], skipped: 'invalid llm response' }; + } + + const memories = parsed.memories ?? []; + let encoded = 0; + for (const mem of memories) { + if (!mem.content || !mem.source) continue; + try { + await this.encode({ + content: mem.content, + source: mem.source as EncodeParams['source'], + salience: mem.salience, + tags: mem.tags, + private: mem.private ?? false, + affect: mem.affect ?? undefined, + }); + encoded++; + } catch (err) { + this.emit('error', err); + } + } + + return { encoded, memories: memories as ReflectResult['memories'] }; + } + + async encodeBatch(paramsList: EncodeParams[]): Promise { + await this._ensureMigrated(); + const ids: string[] = []; + for (const params of paramsList) { + const id = await encodeEpisode(this.db, this.embeddingProvider, params); + ids.push(id); + this.emit('encode', { id, ...params }); + } + + for (let i = 0; i < ids.length; i++) { + this._emitValidation(ids[i]!, paramsList[i]!); + } + + return ids; + } + + async recall(query: string, options: RecallOptions = {}): Promise { + await this._ensureMigrated(); + return recallFn(this.db, this.embeddingProvider, query, { + ...options, + confidenceConfig: this._recallConfig(options), + }); + } + + async *recallStream(query: string, options: RecallOptions = {}): AsyncGenerator { + await this._ensureMigrated(); + yield* recallStreamFn(this.db, this.embeddingProvider, query, { + ...options, + confidenceConfig: this._recallConfig(options), + }); + } + + _recallConfig(options: RecallOptions): ConfidenceConfig { + let config: ConfidenceConfig = options.confidenceConfig ?? this.confidenceConfig; + if (this.contextConfig.enabled && options.context) { + config = { ...config, retrievalContext: options.context }; + } + if (this.affectConfig.enabled && options.mood) { + config = { ...config, retrievalMood: options.mood }; + } + return config; + } + + async consolidate(options: Partial = {}): Promise { + await this._ensureMigrated(); + const result = await runConsolidation(this.db, this.embeddingProvider, { + minClusterSize: options.minClusterSize || this.consolidationConfig.minEpisodes, + similarityThreshold: options.similarityThreshold || 0.80, + extractPrinciple: options.extractPrinciple, + llmProvider: options.llmProvider || this.llmProvider || undefined, + }); + const run = db_prepare_get_status(this.db, result.runId); + const output = { ...result, status: run?.status || 'completed' }; + this.emit('consolidation', output); + return output; + } + + decay(options: { dormantThreshold?: number; halfLives?: Partial } = {}): DecayResult { + const result = applyDecay(this.db, { + dormantThreshold: options.dormantThreshold || this.decayConfig.dormantThreshold, + halfLives: options.halfLives ?? this.confidenceConfig.halfLives, + }); + this.emit('decay', result); + return result; + } + + rollback(runId: string): { rolledBackMemories: number; restoredEpisodes: number } { + const result = rollbackConsolidation(this.db, runId); + this.emit('rollback', { runId, ...result }); + return result; + } + + async resolveTruth(contradictionId: string): Promise { + if (!this.llmProvider) { + throw new Error('resolveTruth requires an LLM provider'); + } + + const contradiction = this.db.prepare( + 'SELECT * FROM contradictions WHERE id = ?' + ).get(contradictionId) as { claim_a_id: string; claim_a_type: string; claim_b_id: string; claim_b_type: string } | undefined; + if (!contradiction) throw new Error(`Contradiction not found: ${contradictionId}`); + + const claimA = this._loadClaimContent(contradiction.claim_a_id, contradiction.claim_a_type); + const claimB = this._loadClaimContent(contradiction.claim_b_id, contradiction.claim_b_type); + + const messages = buildContextResolutionPrompt(claimA, claimB); + const result = await this.llmProvider.json(messages) as TruthResolution; + + const now = new Date().toISOString(); + const newState = result.resolution === 'context_dependent' ? 'context_dependent' : 'resolved'; + this.db.prepare(` + UPDATE contradictions SET state = ?, resolution = ?, resolved_at = ? + WHERE id = ? + `).run(newState, JSON.stringify(result), now, contradictionId); + + if (result.resolution === 'a_wins' && contradiction.claim_a_type === 'semantic') { + this.db.prepare("UPDATE semantics SET state = 'active' WHERE id = ?").run(contradiction.claim_a_id); + } + if (result.resolution === 'b_wins' && contradiction.claim_b_type === 'semantic') { + this.db.prepare("UPDATE semantics SET state = 'active' WHERE id = ?").run(contradiction.claim_b_id); + } + if (result.resolution === 'context_dependent') { + if (contradiction.claim_a_type === 'semantic' && result.conditions) { + this.db.prepare("UPDATE semantics SET state = 'context_dependent', conditions = ? WHERE id = ?") + .run(JSON.stringify(result.conditions), contradiction.claim_a_id); + } + } + + return result; + } + + _loadClaimContent(claimId: string, claimType: string): string { + if (claimType === 'semantic') { + const row = this.db.prepare('SELECT content FROM semantics WHERE id = ?').get(claimId) as ContentRow | undefined; + if (!row) throw new Error(`Semantic memory not found: ${claimId}`); + return row.content; + } else if (claimType === 'episodic') { + const row = this.db.prepare('SELECT content FROM episodes WHERE id = ?').get(claimId) as ContentRow | undefined; + if (!row) throw new Error(`Episode not found: ${claimId}`); + return row.content; + } + throw new Error(`Unknown claim type: ${claimType}`); + } + + consolidationHistory(): ConsolidationRunRow[] { + return getConsolidationHistory(this.db); + } + + introspect(): IntrospectResult { + return introspectFn(this.db); + } + + memoryStatus(): MemoryStatusResult { + const episodes = (this.db.prepare('SELECT COUNT(*) as c FROM episodes').get() as CountRow).c; + const semantics = (this.db.prepare('SELECT COUNT(*) as c FROM semantics').get() as CountRow).c; + const procedures = (this.db.prepare('SELECT COUNT(*) as c FROM procedures').get() as CountRow).c; + const searchableEpisodes = (this.db.prepare('SELECT COUNT(*) as c FROM episodes WHERE embedding IS NOT NULL').get() as CountRow).c; + const searchableSemantics = (this.db.prepare('SELECT COUNT(*) as c FROM semantics WHERE embedding IS NOT NULL').get() as CountRow).c; + const searchableProcedures = (this.db.prepare('SELECT COUNT(*) as c FROM procedures WHERE embedding IS NOT NULL').get() as CountRow).c; + + let vecEpisodes = 0, vecSemantics = 0, vecProcedures = 0; + try { + vecEpisodes = (this.db.prepare('SELECT COUNT(*) as c FROM vec_episodes').get() as CountRow).c; + vecSemantics = (this.db.prepare('SELECT COUNT(*) as c FROM vec_semantics').get() as CountRow).c; + vecProcedures = (this.db.prepare('SELECT COUNT(*) as c FROM vec_procedures').get() as CountRow).c; + } catch { + // vec tables may not exist if no dimensions configured + } + + const dimsRow = this.db.prepare("SELECT value FROM audrey_config WHERE key = 'dimensions'").get() as ConfigRow | undefined; + const dimensions = dimsRow ? parseInt(dimsRow.value, 10) : null; + const versionRow = this.db.prepare("SELECT value FROM audrey_config WHERE key = 'schema_version'").get() as ConfigRow | undefined; + const schemaVersion = versionRow ? parseInt(versionRow.value, 10) : 0; + + const device = this.embeddingProvider._actualDevice + ?? this.embeddingProvider.device + ?? null; + + const healthy = episodes === vecEpisodes + && semantics === vecSemantics + && procedures === vecProcedures; + const reembedRecommended = searchableEpisodes !== vecEpisodes + || searchableSemantics !== vecSemantics + || searchableProcedures !== vecProcedures; + + return { + episodes, + vec_episodes: vecEpisodes, + semantics, + vec_semantics: vecSemantics, + procedures, + vec_procedures: vecProcedures, + searchable_episodes: searchableEpisodes, + searchable_semantics: searchableSemantics, + searchable_procedures: searchableProcedures, + dimensions, + schema_version: schemaVersion, + device: device ?? null, + healthy, + reembed_recommended: reembedRecommended, + }; + } + + async greeting({ context, recentLimit = 10, principleLimit = 5, identityLimit = 5 }: GreetingOptions = {}): Promise { + const recent = this.db.prepare( + 'SELECT id, content, source, tags, salience, created_at FROM episodes WHERE "private" = 0 ORDER BY created_at DESC LIMIT ?' + ).all(recentLimit) as GreetingEpisodeRow[]; + + const principles = this.db.prepare( + 'SELECT id, content, salience, created_at FROM semantics WHERE state = ? ORDER BY salience DESC LIMIT ?' + ).all('active', principleLimit) as GreetingPrincipleRow[]; + + const identity = this.db.prepare( + 'SELECT id, content, tags, salience, created_at FROM episodes WHERE "private" = 1 ORDER BY created_at DESC LIMIT ?' + ).all(identityLimit) as GreetingIdentityRow[]; + + const unresolved = this.db.prepare( + "SELECT id, content, tags, salience, created_at FROM episodes WHERE tags LIKE '%unresolved%' AND salience > 0.3 ORDER BY created_at DESC LIMIT 10" + ).all() as GreetingUnresolvedRow[]; + + const rawAffectRows = this.db.prepare( + "SELECT affect FROM episodes WHERE affect IS NOT NULL AND affect != '{}' ORDER BY created_at DESC LIMIT 20" + ).all() as AffectRow[]; + + const affectParsed = rawAffectRows + .map(r => { try { return JSON.parse(r.affect) as Affect; } catch { return null; } }) + .filter((a): a is Affect => a !== null && a.valence !== undefined); + + let mood: { valence: number; arousal: number; samples: number }; + if (affectParsed.length === 0) { + mood = { valence: 0, arousal: 0, samples: 0 }; + } else { + const sumV = affectParsed.reduce((s, a) => s + (a.valence ?? 0), 0); + const sumA = affectParsed.reduce((s, a) => s + (a.arousal ?? 0), 0); + mood = { + valence: sumV / affectParsed.length, + arousal: sumA / affectParsed.length, + samples: affectParsed.length, + }; + } + + const result: GreetingResult = { recent, principles, mood, unresolved, identity }; + + if (context) { + result.contextual = await this.recall(context, { limit: 5, includePrivate: true }); + } + + return result; + } + + async dream(options: { + minClusterSize?: number; + similarityThreshold?: number; + dormantThreshold?: number; + } = {}): Promise { + await this._ensureMigrated(); + + const consolidation = await this.consolidate({ + minClusterSize: options.minClusterSize, + similarityThreshold: options.similarityThreshold, + }); + + const decay = this.decay({ + dormantThreshold: options.dormantThreshold, + }); + + const stats = this.introspect(); + + const result: DreamResult = { + consolidation, + decay, + stats, + }; + + this.emit('dream', result); + return result; + } + + export(): object { + return exportMemories(this.db); + } + + async import(snapshot: unknown): Promise { + return importMemories(this.db, this.embeddingProvider, snapshot); + } + + startAutoConsolidate(intervalMs: number, options: Partial = {}): void { + if (intervalMs < 1000) { + throw new Error('Auto-consolidation interval must be at least 1000ms'); + } + if (this._autoConsolidateTimer) { + throw new Error('Auto-consolidation is already running'); + } + this._autoConsolidateTimer = setInterval(() => { + this.consolidate(options).catch(err => this.emit('error', err)); + }, intervalMs); + if (typeof this._autoConsolidateTimer.unref === 'function') { + this._autoConsolidateTimer.unref(); + } + } + + stopAutoConsolidate(): void { + if (this._autoConsolidateTimer) { + clearInterval(this._autoConsolidateTimer); + this._autoConsolidateTimer = null; + } + } + + suggestConsolidationParams(): { minClusterSize: number; similarityThreshold: number; confidence: string } { + return suggestParamsFn(this.db); + } + + forget(id: string, options: { purge?: boolean } = {}): ForgetResult { + const result = forgetMemory(this.db, id, options); + this.emit('forget', result); + return result; + } + + async forgetByQuery(query: string, options: { minSimilarity?: number; purge?: boolean } = {}): Promise { + await this._ensureMigrated(); + const result = await forgetByQueryFn(this.db, this.embeddingProvider, query, options); + if (result) this.emit('forget', result); + return result; + } + + purge(): PurgeResult { + const result = purgeMemories(this.db); + this.emit('purge', result); + return result; + } + + close(): void { + if (this._closed) return; + this._closed = true; + this.stopAutoConsolidate(); + closeDatabase(this.db); + } + + async waitForIdle(): Promise { + return Promise.resolve(); + } + + observeTool(input: ObserveToolInput): ObserveToolResult { + const result = observeTool(this.db, { + ...input, + actorAgent: input.actorAgent ?? this.agent, + }); + this.emit('tool-observed', result.event); + return result; + } + + listEvents(query: EventQuery = {}): MemoryEvent[] { + return listEvents(this.db, query); + } + + countEvents(query: EventQuery = {}): number { + return countEvents(this.db, query); + } + + recentFailures(options: { since?: string; limit?: number } = {}): FailurePattern[] { + return recentFailures(this.db, options); + } + async capsule(query: string, options: CapsuleOptions = {}): Promise { const capsule = await buildCapsule(this, query, options); this.emit('capsule', capsule); return capsule; } - findPromotionCandidates(options: FindCandidatesOptions = {}): PromotionCandidate[] { - return findPromotionCandidates(this.db, options); + async preflight(action: string, options: PreflightOptions = {}): Promise { + const preflight = await buildPreflight(this, action, options); + this.emit('preflight', preflight); + return preflight; } - async promote(options: PromoteOptions = {}): Promise { - const target: PromotionTarget = options.target ?? 'claude-rules'; - if (target !== 'claude-rules') { - throw new Error(`promote target "${target}" is not implemented yet. PR 4 v1 ships claude-rules only.`); - } - - const candidates = findPromotionCandidates(this.db, { - minConfidence: options.minConfidence, - minEvidence: options.minEvidence, - limit: options.limit, - target, - }); - - const dryRun = options.dryRun ?? !options.yes; - const projectDir = pathResolve(options.projectDir ?? process.cwd()); - const promotedAt = new Date().toISOString(); - const docs = renderAllRules(candidates, promotedAt); - - const applied: PromotionWriteResult[] = []; - - if (!dryRun) { - for (let i = 0; i < candidates.length; i++) { - const candidate = candidates[i]!; - const doc = docs[i]!; - const absolutePath = join(projectDir, doc.relativePath); - mkdirSync(dirname(absolutePath), { recursive: true }); - const overwritten = existsSync(absolutePath); - writeFileSync(absolutePath, doc.body, 'utf-8'); - - insertEvent(this.db, { - eventType: 'Promotion', - source: 'promote-command', - actorAgent: this.agent, - toolName: target, - outcome: 'succeeded', - cwd: projectDir, - fileFingerprints: [doc.relativePath], - redactionState: 'clean', - metadata: { - memory_ids: [candidate.memory_id], - memory_type: candidate.memory_type, - candidate_id: candidate.candidate_id, - confidence: Number(candidate.confidence.toFixed(3)), - evidence_count: candidate.evidence_count, - failure_prevented: candidate.failure_prevented, - score: Number(candidate.score.toFixed(2)), - target, - absolute_path: absolutePath, - relative_path: doc.relativePath, - overwritten, - }, - }); - - applied.push({ - candidate_id: candidate.candidate_id, - memory_id: candidate.memory_id, - target, - relative_path: doc.relativePath, - absolute_path: absolutePath, - overwritten, - }); - } - } - - const result: PromoteResult = { - target, - dry_run: dryRun, - project_dir: projectDir, - promoted_at: promotedAt, - candidates: candidates.map((c, i) => ({ - ...c, - rendered_path: docs[i]!.relativePath, - })), - applied, - }; - this.emit('promote', result); - return result; + async reflexes(action: string, options: ReflexOptions = {}): Promise { + const report = await buildReflexReport(this, action, options); + this.emit('reflexes', report); + return report; } -} - -export interface PromoteOptions { - target?: PromotionTarget; - minConfidence?: number; - minEvidence?: number; - limit?: number; - dryRun?: boolean; - yes?: boolean; - projectDir?: string; -} - -export interface PromotionCandidateWithPath extends PromotionCandidate { - rendered_path: string; -} - -export interface PromotionWriteResult { - candidate_id: string; - memory_id: string; - target: PromotionTarget; - relative_path: string; - absolute_path: string; - overwritten: boolean; -} -export interface PromoteResult { - target: PromotionTarget; - dry_run: boolean; - project_dir: string; - promoted_at: string; - candidates: PromotionCandidateWithPath[]; - applied: PromotionWriteResult[]; -} - -// Re-exports so the rules-compiler output is easy to consume by callers. -export type { RuleDoc }; - -function db_prepare_get_status(db: Database.Database, runId: string): StatusRow | undefined { - return db.prepare('SELECT status FROM consolidation_runs WHERE id = ?').get(runId) as StatusRow | undefined; -} + findPromotionCandidates(options: FindCandidatesOptions = {}): PromotionCandidate[] { + return findPromotionCandidates(this.db, options); + } + + async promote(options: PromoteOptions = {}): Promise { + const target: PromotionTarget = options.target ?? 'claude-rules'; + if (target !== 'claude-rules') { + throw new Error(`promote target "${target}" is not implemented yet. PR 4 v1 ships claude-rules only.`); + } + + const candidates = findPromotionCandidates(this.db, { + minConfidence: options.minConfidence, + minEvidence: options.minEvidence, + limit: options.limit, + target, + }); + + const dryRun = options.dryRun ?? !options.yes; + const projectDir = pathResolve(options.projectDir ?? process.cwd()); + const promotedAt = new Date().toISOString(); + const docs = renderAllRules(candidates, promotedAt); + + const applied: PromotionWriteResult[] = []; + + if (!dryRun) { + for (let i = 0; i < candidates.length; i++) { + const candidate = candidates[i]!; + const doc = docs[i]!; + const absolutePath = join(projectDir, doc.relativePath); + mkdirSync(dirname(absolutePath), { recursive: true }); + const overwritten = existsSync(absolutePath); + writeFileSync(absolutePath, doc.body, 'utf-8'); + + insertEvent(this.db, { + eventType: 'Promotion', + source: 'promote-command', + actorAgent: this.agent, + toolName: target, + outcome: 'succeeded', + cwd: projectDir, + fileFingerprints: [doc.relativePath], + redactionState: 'clean', + metadata: { + memory_ids: [candidate.memory_id], + memory_type: candidate.memory_type, + candidate_id: candidate.candidate_id, + confidence: Number(candidate.confidence.toFixed(3)), + evidence_count: candidate.evidence_count, + failure_prevented: candidate.failure_prevented, + score: Number(candidate.score.toFixed(2)), + target, + absolute_path: absolutePath, + relative_path: doc.relativePath, + overwritten, + }, + }); + + applied.push({ + candidate_id: candidate.candidate_id, + memory_id: candidate.memory_id, + target, + relative_path: doc.relativePath, + absolute_path: absolutePath, + overwritten, + }); + } + } + + const result: PromoteResult = { + target, + dry_run: dryRun, + project_dir: projectDir, + promoted_at: promotedAt, + candidates: candidates.map((c, i) => ({ + ...c, + rendered_path: docs[i]!.relativePath, + })), + applied, + }; + this.emit('promote', result); + return result; + } +} + +export interface PromoteOptions { + target?: PromotionTarget; + minConfidence?: number; + minEvidence?: number; + limit?: number; + dryRun?: boolean; + yes?: boolean; + projectDir?: string; +} + +export interface PromotionCandidateWithPath extends PromotionCandidate { + rendered_path: string; +} + +export interface PromotionWriteResult { + candidate_id: string; + memory_id: string; + target: PromotionTarget; + relative_path: string; + absolute_path: string; + overwritten: boolean; +} + +export interface PromoteResult { + target: PromotionTarget; + dry_run: boolean; + project_dir: string; + promoted_at: string; + candidates: PromotionCandidateWithPath[]; + applied: PromotionWriteResult[]; +} + +// Re-exports so the rules-compiler output is easy to consume by callers. +export type { RuleDoc }; + +function db_prepare_get_status(db: Database.Database, runId: string): StatusRow | undefined { + return db.prepare('SELECT status FROM consolidation_runs WHERE id = ?').get(runId) as StatusRow | undefined; +} diff --git a/src/index.ts b/src/index.ts index 7b3f75d..c369594 100644 --- a/src/index.ts +++ b/src/index.ts @@ -1,89 +1,105 @@ -export { Audrey } from './audrey.js'; -export { startServer } from './server.js'; -export type { ServerOptions } from './server.js'; -export { createApp } from './routes.js'; -export type { AppOptions } from './routes.js'; -export { computeConfidence, sourceReliability, salienceModifier, DEFAULT_SOURCE_RELIABILITY, DEFAULT_WEIGHTS, DEFAULT_HALF_LIVES } from './confidence.js'; -export { - createEmbeddingProvider, - MockEmbeddingProvider, - LocalEmbeddingProvider, - OpenAIEmbeddingProvider, - GeminiEmbeddingProvider, -} from './embedding.js'; -export { createLLMProvider, MockLLMProvider, AnthropicLLMProvider, OpenAILLMProvider } from './llm.js'; -export { createDatabase, closeDatabase, readStoredDimensions } from './db.js'; -export { recall, recallStream } from './recall.js'; -export { addCausalLink, getCausalChain, articulateCausalLink } from './causal.js'; -export { - buildPrincipleExtractionPrompt, - buildContradictionDetectionPrompt, - buildCausalArticulationPrompt, - buildContextResolutionPrompt, -} from './prompts.js'; -export { exportMemories } from './export.js'; -export { importMemories } from './import.js'; +export { Audrey } from './audrey.js'; +export { startServer } from './server.js'; +export type { ServerOptions } from './server.js'; +export { createApp } from './routes.js'; +export type { AppOptions } from './routes.js'; +export { computeConfidence, sourceReliability, salienceModifier, DEFAULT_SOURCE_RELIABILITY, DEFAULT_WEIGHTS, DEFAULT_HALF_LIVES } from './confidence.js'; +export { + createEmbeddingProvider, + MockEmbeddingProvider, + LocalEmbeddingProvider, + OpenAIEmbeddingProvider, + GeminiEmbeddingProvider, +} from './embedding.js'; +export { createLLMProvider, MockLLMProvider, AnthropicLLMProvider, OpenAILLMProvider } from './llm.js'; +export { createDatabase, closeDatabase, readStoredDimensions } from './db.js'; +export { recall, recallStream } from './recall.js'; +export { addCausalLink, getCausalChain, articulateCausalLink } from './causal.js'; +export { + buildPrincipleExtractionPrompt, + buildContradictionDetectionPrompt, + buildCausalArticulationPrompt, + buildContextResolutionPrompt, +} from './prompts.js'; +export { exportMemories } from './export.js'; +export { importMemories } from './import.js'; export { suggestConsolidationParams } from './adaptive.js'; export { reembedAll } from './migrate.js'; export { forgetMemory, forgetByQuery, purgeMemories } from './forget.js'; export { applyInterference, interferenceModifier } from './interference.js'; export { contextMatchRatio, contextModifier } from './context.js'; export { arousalSalienceBoost, affectSimilarity, moodCongruenceModifier, detectResonance } from './affect.js'; +export { buildPreflight } from './preflight.js'; +export type { + MemoryPreflight, + PreflightDecision, + PreflightOptions, + PreflightSeverity, + PreflightWarning, + PreflightWarningType, +} from './preflight.js'; +export { buildReflexReport } from './reflexes.js'; +export type { + MemoryReflex, + MemoryReflexReport, + ReflexOptions, + ReflexResponseType, +} from './reflexes.js'; export type { - Affect, - AudreyConfig, - CausalLinkRow, - CausalLinkType, - CausalParams, - ChatMessage, - ConfidenceConfig, - ConfidenceWeights, - ComputeConfidenceParams, - ConsolidationMetricRow, - ConsolidationOptions, - ConsolidationResult, - ConsolidationRunRow, - ConsolidationStatus, - ContradictionCounts, - ContradictionRow, - ContradictionState, - ContextConfig, - Database, - DecayResult, - DreamResult, - EmbeddingConfig, - EmbeddingProvider, - EncodeParams, - EpisodeRow, - EpisodicProvenance, - ExtractedPrinciple, - ForgetResult, - GreetingOptions, - GreetingResult, - HalfLives, - InterferenceConfig, - IntrospectResult, - LLMCompletionOptions, - LLMCompletionResult, - LLMConfig, - LLMProvider, - MemoryState, - MemoryStatusResult, - MemoryType, - ProceduralProvenance, - ProceduralRow, - PurgeResult, - RecallOptions, - RecallResult, - ReembedCounts, - ReflectMemory, - ReflectResult, - ResonanceConfig, - SemanticProvenance, - SemanticRow, - SourceReliabilityMap, - SourceType, - TruthResolution, - AffectConfig, -} from './types.js'; + Affect, + AudreyConfig, + CausalLinkRow, + CausalLinkType, + CausalParams, + ChatMessage, + ConfidenceConfig, + ConfidenceWeights, + ComputeConfidenceParams, + ConsolidationMetricRow, + ConsolidationOptions, + ConsolidationResult, + ConsolidationRunRow, + ConsolidationStatus, + ContradictionCounts, + ContradictionRow, + ContradictionState, + ContextConfig, + Database, + DecayResult, + DreamResult, + EmbeddingConfig, + EmbeddingProvider, + EncodeParams, + EpisodeRow, + EpisodicProvenance, + ExtractedPrinciple, + ForgetResult, + GreetingOptions, + GreetingResult, + HalfLives, + InterferenceConfig, + IntrospectResult, + LLMCompletionOptions, + LLMCompletionResult, + LLMConfig, + LLMProvider, + MemoryState, + MemoryStatusResult, + MemoryType, + ProceduralProvenance, + ProceduralRow, + PurgeResult, + RecallOptions, + RecallResult, + ReembedCounts, + ReflectMemory, + ReflectResult, + ResonanceConfig, + SemanticProvenance, + SemanticRow, + SourceReliabilityMap, + SourceType, + TruthResolution, + AffectConfig, +} from './types.js'; diff --git a/src/preflight.ts b/src/preflight.ts new file mode 100644 index 0000000..f3edc6a --- /dev/null +++ b/src/preflight.ts @@ -0,0 +1,322 @@ +import type { Audrey } from './audrey.js'; +import type { CapsuleEntry, CapsuleMode, MemoryCapsule } from './capsule.js'; +import type { FailurePattern } from './events.js'; +import type { MemoryStatusResult } from './types.js'; + +export type PreflightDecision = 'go' | 'caution' | 'block'; +export type PreflightSeverity = 'info' | 'low' | 'medium' | 'high'; +export type PreflightWarningType = + | 'recent_failure' + | 'must_follow' + | 'risk' + | 'procedure' + | 'contradiction' + | 'uncertain' + | 'memory_health'; + +export interface PreflightOptions { + tool?: string; + sessionId?: string; + cwd?: string; + files?: string[]; + strict?: boolean; + limit?: number; + budgetChars?: number; + mode?: CapsuleMode; + recentFailureWindowHours?: number; + recentChangeWindowHours?: number; + includeCapsule?: boolean; + includeStatus?: boolean; + recordEvent?: boolean; +} + +export interface PreflightWarning { + type: PreflightWarningType; + severity: PreflightSeverity; + message: string; + reason: string; + evidence_id?: string; + recommended_action?: string; +} + +export interface MemoryPreflight { + action: string; + query: string; + tool?: string; + cwd?: string; + generated_at: string; + decision: PreflightDecision; + verdict: 'clear' | 'caution' | 'blocked'; + ok_to_proceed: boolean; + risk_score: number; + summary: string; + warnings: PreflightWarning[]; + recent_failures: FailurePattern[]; + status?: MemoryStatusResult; + recommended_actions: string[]; + evidence_ids: string[]; + preflight_event_id?: string; + capsule?: MemoryCapsule; +} + +const SEVERITY_SCORE: Record = { + info: 0.1, + low: 0.25, + medium: 0.55, + high: 0.85, +}; + +function isNonEmptyText(value: string): boolean { + return value.trim().length > 0; +} + +function shorten(value: string, max = 320): string { + const text = value.replace(/\s+/g, ' ').trim(); + if (text.length <= max) return text; + return `${text.slice(0, max - 1)}...`; +} + +function matchesToolOrAction( + action: string, + requestedTool: string | undefined, + failedTool: string | null | undefined, +): boolean { + if (!failedTool) return false; + const failed = failedTool.toLowerCase(); + const tool = requestedTool?.toLowerCase(); + const actionText = action.toLowerCase(); + + return Boolean( + (tool && (failed === tool || failed.includes(tool) || tool.includes(failed))) + || actionText.includes(failed) + ); +} + +function warningFromEntry( + type: PreflightWarningType, + severity: PreflightSeverity, + entry: CapsuleEntry, + fallbackAction: string, +): PreflightWarning { + return { + type, + severity, + message: shorten(entry.content), + reason: entry.reason, + evidence_id: entry.memory_id, + recommended_action: entry.recommended_action ?? fallbackAction, + }; +} + +function addWarning( + warnings: PreflightWarning[], + seen: Set, + warning: PreflightWarning, +): void { + const key = `${warning.type}:${warning.evidence_id ?? warning.message}`; + if (seen.has(key)) return; + seen.add(key); + warnings.push(warning); +} + +function recommendationFromWarning(warning: PreflightWarning): string { + if (warning.recommended_action) return warning.recommended_action; + switch (warning.type) { + case 'recent_failure': + return 'Review the prior failure before running the same tool again.'; + case 'must_follow': + return 'Apply the must-follow memory before acting.'; + case 'risk': + return 'Mitigate the remembered risk before proceeding.'; + case 'procedure': + return 'Use the remembered procedure as the execution path.'; + case 'contradiction': + return 'Resolve or scope the contradiction before relying on either claim.'; + case 'uncertain': + return 'Treat the low-confidence memory as a check, not as settled truth.'; + case 'memory_health': + return 'Repair memory health before relying on recall-sensitive decisions.'; + } +} + +function buildSummary(decision: PreflightDecision, warnings: PreflightWarning[]): string { + if (warnings.length === 0) { + return 'No relevant memory risks, prior failures, or must-follow procedures were found.'; + } + + const high = warnings.filter(w => w.severity === 'high').length; + const medium = warnings.filter(w => w.severity === 'medium').length; + const parts = [`${warnings.length} memory signal${warnings.length === 1 ? '' : 's'}`]; + if (high > 0) parts.push(`${high} high severity`); + if (medium > 0) parts.push(`${medium} medium severity`); + return `${decision === 'block' ? 'Blocked' : 'Caution'}: ${parts.join(', ')} found before acting.`; +} + +export async function buildPreflight( + audrey: Audrey, + action: string, + options: PreflightOptions = {}, +): Promise { + if (!isNonEmptyText(action)) { + throw new Error('action must be a non-empty string'); + } + + const queryParts = [ + action.trim(), + options.tool ? `tool:${options.tool}` : '', + options.cwd ? `cwd:${options.cwd}` : '', + ].filter(Boolean); + const query = queryParts.join('\n'); + const capsule = await audrey.capsule(query, { + limit: options.limit ?? 12, + budgetChars: options.budgetChars ?? 3000, + mode: options.mode ?? 'conservative', + recentChangeWindowHours: options.recentChangeWindowHours ?? 72, + includeRisks: true, + includeContradictions: true, + }); + + const warnings: PreflightWarning[] = []; + const seen = new Set(); + const includeStatus = options.includeStatus ?? true; + const status = includeStatus ? audrey.memoryStatus() : undefined; + + if (status && !status.healthy) { + addWarning(warnings, seen, { + type: 'memory_health', + severity: 'high', + message: 'Audrey memory index is unhealthy; recall may be incomplete or stale.', + reason: 'memoryStatus().healthy is false.', + recommended_action: 'Run npx audrey status and npx audrey reembed before depending on memory.', + }); + } else if (status?.reembed_recommended) { + addWarning(warnings, seen, { + type: 'memory_health', + severity: 'medium', + message: 'Audrey recommends re-embedding before recall-sensitive work.', + reason: 'memoryStatus().reembed_recommended is true.', + recommended_action: 'Run npx audrey reembed during a safe maintenance window.', + }); + } + + const since = new Date( + Date.now() - (options.recentFailureWindowHours ?? 168) * 60 * 60 * 1000, + ).toISOString(); + const recentFailures = audrey.recentFailures({ since, limit: 20 }); + const matchingFailures: FailurePattern[] = []; + for (const failure of recentFailures) { + if (!matchesToolOrAction(action, options.tool, failure.tool_name)) continue; + matchingFailures.push(failure); + const toolLabel = failure.tool_name || options.tool || 'tool'; + addWarning(warnings, seen, { + type: 'recent_failure', + severity: failure.failure_count >= 3 ? 'high' : 'medium', + message: failure.last_error_summary + ? `${toolLabel} failed ${failure.failure_count}x recently: ${shorten(failure.last_error_summary, 220)}` + : `${toolLabel} failed ${failure.failure_count}x recently.`, + reason: 'Matched a recent failed tool event for this action.', + evidence_id: `failure:${toolLabel}:${failure.last_failed_at}`, + recommended_action: `Before re-running ${toolLabel}, check what changed since the last failure.`, + }); + } + + for (const entry of capsule.sections.must_follow) { + addWarning(warnings, seen, warningFromEntry( + 'must_follow', + 'high', + entry, + 'Apply this must-follow rule before acting.', + )); + } + + for (const entry of capsule.sections.risks) { + addWarning(warnings, seen, warningFromEntry( + entry.memory_type === 'tool_failure' ? 'recent_failure' : 'risk', + entry.memory_type === 'tool_failure' ? 'medium' : 'high', + entry, + 'Mitigate this remembered risk before proceeding.', + )); + } + + for (const entry of capsule.sections.procedures) { + addWarning(warnings, seen, warningFromEntry( + 'procedure', + 'info', + entry, + 'Use this remembered procedure as guidance.', + )); + } + + for (const entry of capsule.sections.contradictions) { + addWarning(warnings, seen, warningFromEntry( + 'contradiction', + 'high', + entry, + 'Resolve or scope this contradiction before acting.', + )); + } + + for (const entry of capsule.sections.uncertain_or_disputed) { + addWarning(warnings, seen, warningFromEntry( + 'uncertain', + 'medium', + entry, + 'Treat this as uncertain context and verify before relying on it.', + )); + } + + warnings.sort((a, b) => SEVERITY_SCORE[b.severity] - SEVERITY_SCORE[a.severity]); + const riskScore = warnings.reduce((score, warning) => Math.max(score, SEVERITY_SCORE[warning.severity]), 0); + const hasHigh = warnings.some(w => w.severity === 'high'); + const hasMedium = warnings.some(w => w.severity === 'medium'); + const decision: PreflightDecision = options.strict && hasHigh + ? 'block' + : hasHigh || hasMedium + ? 'caution' + : 'go'; + const verdict = decision === 'go' ? 'clear' : decision === 'block' ? 'blocked' : 'caution'; + + const recommendedActions = [...new Set(warnings.map(recommendationFromWarning))]; + if (decision === 'block') { + recommendedActions.unshift('Do not proceed until the high-severity memory warning is addressed.'); + } + + const preflightEvent = options.recordEvent && options.tool + ? audrey.observeTool({ + event: 'PreToolUse', + tool: options.tool, + sessionId: options.sessionId, + input: { action: action.trim(), tool: options.tool }, + outcome: 'unknown', + cwd: options.cwd, + files: options.files, + metadata: { + preflight_decision: decision, + preflight_warning_count: warnings.length, + }, + }).event + : undefined; + + return { + action: action.trim(), + query, + tool: options.tool, + cwd: options.cwd, + generated_at: new Date().toISOString(), + decision, + verdict, + ok_to_proceed: decision !== 'block', + risk_score: Number(riskScore.toFixed(2)), + summary: buildSummary(decision, warnings), + warnings, + recent_failures: matchingFailures, + ...(status ? { status } : {}), + recommended_actions: recommendedActions, + evidence_ids: [...new Set([ + ...capsule.evidence_ids, + ...warnings.map(w => w.evidence_id).filter((id): id is string => Boolean(id)), + ])], + ...(preflightEvent ? { preflight_event_id: preflightEvent.id } : {}), + ...(options.includeCapsule === false ? {} : { capsule }), + }; +} diff --git a/src/reflexes.ts b/src/reflexes.ts new file mode 100644 index 0000000..3a4236d --- /dev/null +++ b/src/reflexes.ts @@ -0,0 +1,138 @@ +import { createHash } from 'node:crypto'; +import type { Audrey } from './audrey.js'; +import { + buildPreflight, + type MemoryPreflight, + type PreflightDecision, + type PreflightOptions, + type PreflightSeverity, + type PreflightWarning, + type PreflightWarningType, +} from './preflight.js'; + +export type ReflexResponseType = 'guide' | 'warn' | 'block'; + +export interface MemoryReflex { + id: string; + trigger: string; + response_type: ReflexResponseType; + severity: PreflightSeverity; + source: PreflightWarningType; + response: string; + reason: string; + evidence_id?: string; + action: string; + tool?: string; + cwd?: string; +} + +export interface ReflexOptions extends PreflightOptions { + includePreflight?: boolean; +} + +export interface MemoryReflexReport { + action: string; + query: string; + tool?: string; + cwd?: string; + generated_at: string; + decision: PreflightDecision; + risk_score: number; + summary: string; + reflexes: MemoryReflex[]; + evidence_ids: string[]; + recommended_actions: string[]; + preflight?: MemoryPreflight; +} + +function reflexId(warning: PreflightWarning, action: string, tool?: string): string { + const input = [ + warning.type, + warning.evidence_id ?? '', + warning.message, + action, + tool ?? '', + ].join('\n'); + return `reflex_${createHash('sha256').update(input).digest('hex').slice(0, 12)}`; +} + +function responseType(warning: PreflightWarning, decision: PreflightDecision): ReflexResponseType { + if (decision === 'block' && warning.severity === 'high') return 'block'; + if (warning.type === 'procedure' && warning.severity === 'info') return 'guide'; + return 'warn'; +} + +function triggerFor(warning: PreflightWarning, action: string, tool?: string): string { + if (warning.type === 'recent_failure' && tool) { + return `Before using ${tool}`; + } + if (warning.type === 'memory_health') { + return 'Before relying on Audrey recall'; + } + if (tool) { + return `Before ${tool}: ${action}`; + } + return `Before: ${action}`; +} + +function responseFor(warning: PreflightWarning): string { + if (warning.recommended_action && (warning.type === 'must_follow' || warning.type === 'risk')) { + return `${warning.recommended_action} ${warning.message}`; + } + return warning.recommended_action ?? warning.message; +} + +function summarizeReflexes(decision: PreflightDecision, reflexes: MemoryReflex[]): string { + if (reflexes.length === 0) { + return 'No active memory reflexes matched this action.'; + } + + const blocks = reflexes.filter(r => r.response_type === 'block').length; + const warnings = reflexes.filter(r => r.response_type === 'warn').length; + const guides = reflexes.filter(r => r.response_type === 'guide').length; + const parts = [`${reflexes.length} memory reflex${reflexes.length === 1 ? '' : 'es'}`]; + if (blocks > 0) parts.push(`${blocks} blocking`); + if (warnings > 0) parts.push(`${warnings} warning`); + if (guides > 0) parts.push(`${guides} guidance`); + return `${decision === 'block' ? 'Stop' : decision === 'caution' ? 'Slow down' : 'Proceed'}: ${parts.join(', ')} matched.`; +} + +export async function buildReflexReport( + audrey: Audrey, + action: string, + options: ReflexOptions = {}, +): Promise { + const preflight = await buildPreflight(audrey, action, { + ...options, + includeCapsule: options.includeCapsule ?? false, + }); + + const reflexes = preflight.warnings.map((warning): MemoryReflex => ({ + id: reflexId(warning, preflight.action, preflight.tool), + trigger: triggerFor(warning, preflight.action, preflight.tool), + response_type: responseType(warning, preflight.decision), + severity: warning.severity, + source: warning.type, + response: responseFor(warning), + reason: warning.reason, + ...(warning.evidence_id ? { evidence_id: warning.evidence_id } : {}), + action: preflight.action, + ...(preflight.tool ? { tool: preflight.tool } : {}), + ...(preflight.cwd ? { cwd: preflight.cwd } : {}), + })); + + return { + action: preflight.action, + query: preflight.query, + ...(preflight.tool ? { tool: preflight.tool } : {}), + ...(preflight.cwd ? { cwd: preflight.cwd } : {}), + generated_at: new Date().toISOString(), + decision: preflight.decision, + risk_score: preflight.risk_score, + summary: summarizeReflexes(preflight.decision, reflexes), + reflexes, + evidence_ids: preflight.evidence_ids, + recommended_actions: preflight.recommended_actions, + ...(options.includePreflight ? { preflight } : {}), + }; +} diff --git a/src/routes.ts b/src/routes.ts index fde0d2b..cf61693 100644 --- a/src/routes.ts +++ b/src/routes.ts @@ -1,122 +1,169 @@ import { Hono } from 'hono'; import type { Audrey } from './audrey.js'; +import type { PreflightOptions } from './preflight.js'; import { VERSION } from '../mcp-server/config.js'; - + export interface AppOptions { apiKey?: string; } -export function createApp(audrey: Audrey, options: AppOptions = {}): Hono { - const app = new Hono(); +type RouteBody = { + action?: string; + query?: string; + tool?: string; + session_id?: string; + sessionId?: string; + cwd?: string; + files?: string[]; + strict?: boolean; + limit?: number; + budget_chars?: number; + budgetChars?: number; + mode?: PreflightOptions['mode']; + failure_window_hours?: number; + recent_failure_window_hours?: number; + recentFailureWindowHours?: number; + recent_change_window_hours?: number; + recentChangeWindowHours?: number; + include_capsule?: boolean; + includeCapsule?: boolean; + include_status?: boolean; + includeStatus?: boolean; + record_event?: boolean; + recordEvent?: boolean; + include_preflight?: boolean; + includePreflight?: boolean; +}; - // Health check — no auth required. - // Fields kept for backward compatibility across Audrey client surfaces: - // status / healthy — original TS-era field names (tests/http-api.test.js) - // ok / version — Python SDK HealthResponse contract - // (python/audrey_memory/types.py) - app.get('/health', (c) => { - try { - const status = audrey.memoryStatus(); - return c.json({ - status: 'ok', - ok: true, - healthy: status.healthy, - version: VERSION, - }); - } catch { - return c.json({ - status: 'error', - ok: false, - healthy: false, - version: VERSION, - }, 500); - } - }); - - // API key middleware — only if apiKey is configured - if (options.apiKey) { - app.use('/v1/*', async (c, next) => { - const auth = c.req.header('Authorization'); - if (!auth || auth !== `Bearer ${options.apiKey}`) { - return c.json({ error: 'Unauthorized' }, 401); - } - await next(); - }); - } +function actionFromBody(body: RouteBody): unknown { + return body.action ?? body.query; +} - // POST /v1/encode - app.post('/v1/encode', async (c) => { - try { - const body = await c.req.json(); - const id = await audrey.encode({ - content: body.content, - source: body.source, - tags: body.tags, - salience: body.salience, - context: body.context, - affect: body.affect, - private: body.private, - }); - return c.json({ id, content: body.content, source: body.source, private: body.private ?? false }); - } catch (err: unknown) { - const message = err instanceof Error ? err.message : String(err); - return c.json({ error: message }, 400); - } - }); +function preflightOptionsFromBody(body: RouteBody): PreflightOptions { + return { + tool: body.tool, + sessionId: body.session_id ?? body.sessionId, + cwd: body.cwd, + files: body.files, + strict: body.strict, + limit: body.limit, + budgetChars: body.budget_chars ?? body.budgetChars, + mode: body.mode, + recentFailureWindowHours: body.failure_window_hours + ?? body.recent_failure_window_hours + ?? body.recentFailureWindowHours, + recentChangeWindowHours: body.recent_change_window_hours ?? body.recentChangeWindowHours, + includeCapsule: body.include_capsule ?? body.includeCapsule, + includeStatus: body.include_status ?? body.includeStatus, + recordEvent: body.record_event ?? body.recordEvent, + }; +} - // POST /v1/recall - app.post('/v1/recall', async (c) => { +export function createApp(audrey: Audrey, options: AppOptions = {}): Hono { + const app = new Hono(); + + // Health check - no auth required. + // Fields kept for backward compatibility across Audrey client surfaces: + // status / healthy - original TS-era field names (tests/http-api.test.js) + // ok / version - Python SDK HealthResponse contract + // (python/audrey_memory/types.py) + app.get('/health', (c) => { + try { + const status = audrey.memoryStatus(); + return c.json({ + status: 'ok', + ok: true, + healthy: status.healthy, + version: VERSION, + }); + } catch { + return c.json({ + status: 'error', + ok: false, + healthy: false, + version: VERSION, + }, 500); + } + }); + + // API key middleware - only if apiKey is configured + if (options.apiKey) { + app.use('/v1/*', async (c, next) => { + const auth = c.req.header('Authorization'); + if (!auth || auth !== `Bearer ${options.apiKey}`) { + return c.json({ error: 'Unauthorized' }, 401); + } + await next(); + }); + } + + // POST /v1/encode + app.post('/v1/encode', async (c) => { + try { + const body = await c.req.json(); + const id = await audrey.encode({ + content: body.content, + source: body.source, + tags: body.tags, + salience: body.salience, + context: body.context, + affect: body.affect, + private: body.private, + }); + return c.json({ id, content: body.content, source: body.source, private: body.private ?? false }); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 400); + } + }); + + // POST /v1/recall + app.post('/v1/recall', async (c) => { + try { + const body = await c.req.json(); + const { query, ...opts } = body; + const results = await audrey.recall(query, opts); + return c.json(results); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 400); + } + }); + + // POST /v1/capsule + app.post('/v1/capsule', async (c) => { + try { + const body = await c.req.json(); + if (typeof body.query !== 'string' || body.query.trim().length === 0) { + return c.json({ error: 'query must be a non-empty string' }, 400); + } + + const result = await audrey.capsule(body.query, { + limit: body.limit, + budgetChars: body.budget_chars ?? body.budgetChars, + mode: body.mode, + recentChangeWindowHours: body.recent_change_window_hours ?? body.recentChangeWindowHours, + includeRisks: body.include_risks ?? body.includeRisks, + includeContradictions: body.include_contradictions ?? body.includeContradictions, + recall: body.recall, + }); + return c.json(result); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 400); + } + }); + + // POST /v1/preflight + app.post('/v1/preflight', async (c) => { try { const body = await c.req.json(); - const { query, ...opts } = body; - const results = await audrey.recall(query, opts); - return c.json(results); - } catch (err: unknown) { - const message = err instanceof Error ? err.message : String(err); - return c.json({ error: message }, 400); - } - }); - - // POST /v1/consolidate - app.post('/v1/consolidate', async (c) => { - try { - const body = await c.req.json().catch(() => ({})); - const result = await audrey.consolidate(body); - return c.json(result); - } catch (err: unknown) { - const message = err instanceof Error ? err.message : String(err); - return c.json({ error: message }, 500); - } - }); - - // POST /v1/dream - app.post('/v1/dream', async (c) => { - try { - const body = await c.req.json().catch(() => ({})); - const result = await audrey.dream(body); - return c.json(result); - } catch (err: unknown) { - const message = err instanceof Error ? err.message : String(err); - return c.json({ error: message }, 500); - } - }); - - // GET /v1/introspect - app.get('/v1/introspect', (c) => { - try { - const result = audrey.introspect(); - return c.json(result); - } catch (err: unknown) { - const message = err instanceof Error ? err.message : String(err); - return c.json({ error: message }, 500); - } - }); + const action = actionFromBody(body); + if (typeof action !== 'string' || action.trim().length === 0) { + return c.json({ error: 'action must be a non-empty string' }, 400); + } - // POST /v1/resolve-truth - app.post('/v1/resolve-truth', async (c) => { - try { - const body = await c.req.json(); - const result = await audrey.resolveTruth(body.contradiction_id); + const result = await audrey.preflight(action, preflightOptionsFromBody(body)); return c.json(result); } catch (err: unknown) { const message = err instanceof Error ? err.message : String(err); @@ -124,111 +171,175 @@ export function createApp(audrey: Audrey, options: AppOptions = {}): Hono { } }); - // GET /v1/export - app.get('/v1/export', (c) => { - try { - const snapshot = audrey.export(); - return c.json(snapshot); - } catch (err: unknown) { - const message = err instanceof Error ? err.message : String(err); - return c.json({ error: message }, 500); - } - }); - - // POST /v1/import - app.post('/v1/import', async (c) => { - try { - const body = await c.req.json(); - await audrey.import(body.snapshot); - return c.json({ imported: true }); - } catch (err: unknown) { - const message = err instanceof Error ? err.message : String(err); - return c.json({ error: message }, 400); - } - }); - - // POST /v1/forget - app.post('/v1/forget', async (c) => { + // POST /v1/reflexes + app.post('/v1/reflexes', async (c) => { try { const body = await c.req.json(); - const hasId = 'id' in body && body.id; - const hasQuery = 'query' in body && body.query; - - if (hasId && hasQuery) { - return c.json({ error: 'Provide exactly one of "id" or "query", not both' }, 400); - } - if (!hasId && !hasQuery) { - return c.json({ error: 'Provide exactly one of "id" or "query"' }, 400); - } - - if (hasId) { - const result = audrey.forget(body.id, { purge: body.purge }); - return c.json(result); - } else { - const result = await audrey.forgetByQuery(body.query, { - minSimilarity: body.minSimilarity, - purge: body.purge, - }); - if (!result) { - return c.json({ error: 'No matching memory found' }, 404); - } - return c.json(result); + const action = actionFromBody(body); + if (typeof action !== 'string' || action.trim().length === 0) { + return c.json({ error: 'action must be a non-empty string' }, 400); } - } catch (err: unknown) { - const message = err instanceof Error ? err.message : String(err); - return c.json({ error: message }, 400); - } - }); - // POST /v1/decay - app.post('/v1/decay', async (c) => { - try { - const body = await c.req.json().catch(() => ({})); - const result = audrey.decay({ - dormantThreshold: (body as Record).dormantThreshold as number | undefined, - halfLives: (body as Record).halfLives as Record | undefined, + const result = await audrey.reflexes(action, { + ...preflightOptionsFromBody(body), + includePreflight: body.include_preflight ?? body.includePreflight, }); return c.json(result); } catch (err: unknown) { const message = err instanceof Error ? err.message : String(err); - return c.json({ error: message }, 500); - } - }); - - // GET /v1/status - app.get('/v1/status', (c) => { - try { - const result = audrey.memoryStatus(); - return c.json(result); - } catch (err: unknown) { - const message = err instanceof Error ? err.message : String(err); - return c.json({ error: message }, 500); - } - }); - - // POST /v1/reflect - app.post('/v1/reflect', async (c) => { - try { - const body = await c.req.json(); - const result = await audrey.reflect(body.turns); - return c.json(result); - } catch (err: unknown) { - const message = err instanceof Error ? err.message : String(err); - return c.json({ error: message }, 400); - } - }); - - // POST /v1/greeting - app.post('/v1/greeting', async (c) => { - try { - const body = await c.req.json().catch(() => ({})); - const result = await audrey.greeting({ context: (body as Record).context as string | undefined }); - return c.json(result); - } catch (err: unknown) { - const message = err instanceof Error ? err.message : String(err); - return c.json({ error: message }, 500); - } - }); - - return app; -} + return c.json({ error: message }, 400); + } + }); + + // POST /v1/consolidate + app.post('/v1/consolidate', async (c) => { + try { + const body = await c.req.json().catch(() => ({})); + const result = await audrey.consolidate(body); + return c.json(result); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 500); + } + }); + + // POST /v1/dream + app.post('/v1/dream', async (c) => { + try { + const body = await c.req.json().catch(() => ({})); + const result = await audrey.dream(body); + return c.json(result); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 500); + } + }); + + app.get('/v1/introspect', (c) => { + try { + const result = audrey.introspect(); + return c.json(result); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 500); + } + }); + + // POST /v1/resolve-truth + app.post('/v1/resolve-truth', async (c) => { + try { + const body = await c.req.json(); + const result = await audrey.resolveTruth(body.contradiction_id); + return c.json(result); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 400); + } + }); + + app.get('/v1/export', (c) => { + try { + const snapshot = audrey.export(); + return c.json(snapshot); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 500); + } + }); + + // POST /v1/import + app.post('/v1/import', async (c) => { + try { + const body = await c.req.json(); + await audrey.import(body.snapshot); + return c.json({ imported: true }); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 400); + } + }); + + // POST /v1/forget + app.post('/v1/forget', async (c) => { + try { + const body = await c.req.json(); + const hasId = 'id' in body && body.id; + const hasQuery = 'query' in body && body.query; + + if (hasId && hasQuery) { + return c.json({ error: 'Provide exactly one of "id" or "query", not both' }, 400); + } + if (!hasId && !hasQuery) { + return c.json({ error: 'Provide exactly one of "id" or "query"' }, 400); + } + + if (hasId) { + const result = audrey.forget(body.id, { purge: body.purge }); + return c.json(result); + } else { + const result = await audrey.forgetByQuery(body.query, { + minSimilarity: body.minSimilarity, + purge: body.purge, + }); + if (!result) { + return c.json({ error: 'No matching memory found' }, 404); + } + return c.json(result); + } + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 400); + } + }); + + // POST /v1/decay + app.post('/v1/decay', async (c) => { + try { + const body = await c.req.json().catch(() => ({})); + const result = audrey.decay({ + dormantThreshold: (body as Record).dormantThreshold as number | undefined, + halfLives: (body as Record).halfLives as Record | undefined, + }); + return c.json(result); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 500); + } + }); + + app.get('/v1/status', (c) => { + try { + const result = audrey.memoryStatus(); + return c.json(result); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 500); + } + }); + + // POST /v1/reflect + app.post('/v1/reflect', async (c) => { + try { + const body = await c.req.json(); + const result = await audrey.reflect(body.turns); + return c.json(result); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 400); + } + }); + + // POST /v1/greeting + app.post('/v1/greeting', async (c) => { + try { + const body = await c.req.json().catch(() => ({})); + const result = await audrey.greeting({ context: (body as Record).context as string | undefined }); + return c.json(result); + } catch (err: unknown) { + const message = err instanceof Error ? err.message : String(err); + return c.json({ error: message }, 500); + } + }); + + return app; +} diff --git a/tests/http-api.test.js b/tests/http-api.test.js index 160e023..5077833 100644 --- a/tests/http-api.test.js +++ b/tests/http-api.test.js @@ -72,6 +72,83 @@ describe('HTTP API', () => { expect(body[0].content).toContain('SQLite'); }); + it('POST /v1/capsule returns a structured memory packet', async () => { + await app.request('/v1/encode', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ + content: 'Before editing Audrey host docs, keep Codex and Ollama as first-class targets', + source: 'direct-observation', + tags: ['procedure', 'codex', 'ollama'], + }), + }); + + const res = await app.request('/v1/capsule', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ query: 'Audrey Codex Ollama host docs', budget_chars: 2000 }), + }); + expect(res.status).toBe(200); + const body = await res.json(); + expect(body.query).toBe('Audrey Codex Ollama host docs'); + expect(body.sections).toHaveProperty('procedures'); + expect(Array.isArray(body.evidence_ids)).toBe(true); + }); + + it('POST /v1/preflight checks memory before an action', async () => { + audrey.observeTool({ + event: 'PostToolUse', + tool: 'npm test', + outcome: 'failed', + errorSummary: 'Vitest failed with spawn EPERM on this Windows host', + cwd: process.cwd(), + }); + + const res = await app.request('/v1/preflight', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ + action: 'run npm test before release', + tool: 'npm test', + record_event: true, + include_capsule: false, + }), + }); + expect(res.status).toBe(200); + const body = await res.json(); + expect(body.action).toBe('run npm test before release'); + expect(body.decision).toBe('caution'); + expect(body.warnings.some(w => w.type === 'recent_failure')).toBe(true); + expect(body.preflight_event_id).toMatch(/^01/); + expect(body.capsule).toBeUndefined(); + }); + + it('POST /v1/reflexes returns trigger-response memory reflexes', async () => { + audrey.observeTool({ + event: 'PostToolUse', + tool: 'npm test', + outcome: 'failed', + errorSummary: 'Vitest failed with spawn EPERM on this Windows host', + cwd: process.cwd(), + }); + + const res = await app.request('/v1/reflexes', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ + action: 'run npm test before release', + tool: 'npm test', + include_preflight: true, + }), + }); + expect(res.status).toBe(200); + const body = await res.json(); + expect(body.decision).toBe('caution'); + expect(body.reflexes[0].trigger).toBe('Before using npm test'); + expect(body.reflexes[0].response_type).toBe('warn'); + expect(body.preflight.decision).toBe('caution'); + }); + it('POST /v1/dream runs full cycle', async () => { const res = await app.request('/v1/dream', { method: 'POST', diff --git a/tests/mcp-server.test.js b/tests/mcp-server.test.js index 57da87a..38bcd09 100644 --- a/tests/mcp-server.test.js +++ b/tests/mcp-server.test.js @@ -3,7 +3,17 @@ import { z } from 'zod'; import { EventEmitter } from 'node:events'; import { Audrey } from '../dist/src/index.js'; import { readStoredDimensions } from '../dist/src/db.js'; -import { buildAudreyConfig, buildInstallArgs, DEFAULT_DATA_DIR, MCP_ENTRYPOINT, SERVER_NAME, VERSION } from '../dist/mcp-server/config.js'; +import { + buildAudreyConfig, + buildInstallArgs, + buildStdioMcpServerConfig, + DEFAULT_AGENT, + DEFAULT_DATA_DIR, + formatMcpHostConfig, + MCP_ENTRYPOINT, + SERVER_NAME, + VERSION, +} from '../dist/mcp-server/config.js'; import { MAX_MEMORY_CONTENT_LENGTH, buildStatusReport, @@ -12,9 +22,12 @@ import { memoryEncodeToolSchema, memoryForgetToolSchema, memoryImportToolSchema, + memoryPreflightToolSchema, memoryRecallToolSchema, + memoryReflexesToolSchema, registerShutdownHandlers, registerDreamTool, + runDemoCommand, runStatusCommand, validateForgetSelection, } from '../dist/mcp-server/index.js'; @@ -54,7 +67,7 @@ describe('MCP CLI: buildAudreyConfig', () => { it('uses defaults when no env vars set', () => { const config = buildAudreyConfig(); expect(config.dataDir).toBe(DEFAULT_DATA_DIR); - expect(config.agent).toBe('claude-code'); + expect(config.agent).toBe(DEFAULT_AGENT); expect(config.embedding.provider).toBe('local'); expect(config.embedding.dimensions).toBe(384); expect(config.llm).toBeUndefined(); @@ -179,6 +192,59 @@ describe('MCP CLI: buildInstallArgs', () => { const firstEnvIdx = args.indexOf('-e'); expect(nameIdx).toBeLessThan(firstEnvIdx); }); + + it('keeps claude-code as the agent name for the Claude CLI installer', () => { + const args = buildInstallArgs({}); + const envPairsStr = args.filter((_, i) => args[i - 1] === '-e').join(' '); + expect(envPairsStr).toContain('AUDREY_AGENT=claude-code'); + }); +}); + +describe('MCP CLI: host-neutral config output', () => { + it('builds a generic stdio config with the local-agent default', () => { + const config = buildStdioMcpServerConfig({}); + expect(config.command).toBe(process.execPath); + expect(config.args).toEqual([MCP_ENTRYPOINT]); + expect(config.env.AUDREY_AGENT).toBe(DEFAULT_AGENT); + expect(config.env.AUDREY_EMBEDDING_PROVIDER).toBe('local'); + }); + + it('formats Codex TOML with a codex agent identity', () => { + const text = formatMcpHostConfig('codex', {}); + expect(text).toContain(`[mcp_servers.${SERVER_NAME}]`); + expect(text).toContain('AUDREY_AGENT = "codex"'); + expect(text).toContain('AUDREY_EMBEDDING_PROVIDER = "local"'); + }); + + it('formats VS Code MCP JSON using the servers envelope', () => { + const text = formatMcpHostConfig('vscode', {}); + const parsed = JSON.parse(text); + expect(parsed.servers[SERVER_NAME].type).toBe('stdio'); + expect(parsed.servers[SERVER_NAME].env.AUDREY_AGENT).toBe('vscode-copilot'); + }); + + it('does not print provider secrets in generated host configs', () => { + const text = formatMcpHostConfig('codex', { + ANTHROPIC_API_KEY: 'sk-ant-secret', + OPENAI_API_KEY: 'sk-openai-secret', + }); + expect(text).not.toContain('sk-ant-secret'); + expect(text).not.toContain('sk-openai-secret'); + expect(text).not.toContain('ANTHROPIC_API_KEY'); + expect(text).not.toContain('OPENAI_API_KEY'); + }); +}); + +describe('MCP CLI: demo command', () => { + it('prints a self-contained memory demo without external services', async () => { + const lines = []; + await runDemoCommand({ out: (...args) => lines.push(args.join(' ')) }); + const output = lines.join('\n'); + expect(output).toContain('Audrey 60-second memory demo'); + expect(output).toContain('Capsule highlights:'); + expect(output).toContain('Recall proof:'); + expect(output).toContain('npx audrey mcp-config codex'); + }); }); describe('MCP validation hardening', () => { @@ -210,6 +276,31 @@ describe('MCP validation hardening', () => { expect(schema.safeParse({ query: 'test', limit: 50 }).success).toBe(true); }); + it('memory_preflight rejects empty actions and accepts strict risk checks', () => { + const schema = z.object(memoryPreflightToolSchema); + expect(schema.safeParse({ action: '', tool: 'Bash' }).success).toBe(false); + expect(schema.safeParse({ + action: 'run npm test', + tool: 'npm test', + strict: true, + failure_window_hours: 24, + record_event: true, + include_capsule: false, + }).success).toBe(true); + }); + + it('memory_reflexes accepts preflight inputs plus include_preflight', () => { + const schema = z.object(memoryReflexesToolSchema); + expect(schema.safeParse({ action: '', tool: 'Bash' }).success).toBe(false); + expect(schema.safeParse({ + action: 'deploy Audrey', + tool: 'deploy', + strict: true, + include_preflight: true, + include_capsule: false, + }).success).toBe(true); + }); + it('memory_import accepts consolidationMetrics snapshots', () => { const schema = z.object(memoryImportToolSchema); expect(schema.safeParse({ @@ -888,5 +979,3 @@ describe('MCP tool: memory_status', () => { expect(status.healthy).toBe(false); }); }); - - diff --git a/tests/preflight.test.js b/tests/preflight.test.js new file mode 100644 index 0000000..748ccf0 --- /dev/null +++ b/tests/preflight.test.js @@ -0,0 +1,92 @@ +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { existsSync, rmSync, mkdirSync } from 'node:fs'; +import { Audrey } from '../dist/src/index.js'; + +const TEST_DIR = './test-preflight-data'; + +describe('Memory Preflight', () => { + let audrey; + + beforeEach(() => { + if (existsSync(TEST_DIR)) rmSync(TEST_DIR, { recursive: true }); + mkdirSync(TEST_DIR, { recursive: true }); + audrey = new Audrey({ + dataDir: TEST_DIR, + agent: 'preflight-test', + embedding: { provider: 'mock', dimensions: 8 }, + }); + }); + + afterEach(() => { + audrey.close(); + if (existsSync(TEST_DIR)) rmSync(TEST_DIR, { recursive: true }); + }); + + it('returns go when there are no relevant memory warnings', async () => { + const result = await audrey.preflight('format the docs', { + includeCapsule: false, + }); + + expect(result.decision).toBe('go'); + expect(result.warnings).toEqual([]); + expect(result.risk_score).toBe(0); + expect(result.capsule).toBeUndefined(); + }); + + it('warns before repeating a known failed tool action', async () => { + audrey.observeTool({ + event: 'PostToolUse', + tool: 'npm test', + outcome: 'failed', + errorSummary: 'Vitest failed with spawn EPERM on this Windows host', + cwd: process.cwd(), + }); + + const result = await audrey.preflight('run npm test before release', { + tool: 'npm test', + strict: true, + includeCapsule: false, + }); + + expect(result.decision).toBe('caution'); + expect(result.risk_score).toBeGreaterThan(0); + expect(result.warnings.some(w => w.type === 'recent_failure')).toBe(true); + expect(result.warnings.map(w => w.message).join('\n')).toMatch(/spawn EPERM|failed/i); + expect(result.recent_failures).toHaveLength(1); + expect(result.status.healthy).toBe(true); + expect(result.recommended_actions.length).toBeGreaterThan(0); + }); + + it('blocks in strict mode when a must-follow memory is relevant', async () => { + await audrey.encode({ + content: 'Never publish Audrey without running npm pack --dry-run first.', + source: 'direct-observation', + tags: ['must-follow', 'release'], + }); + + const result = await audrey.preflight('publish Audrey release', { + strict: true, + includeCapsule: false, + }); + + expect(result.decision).toBe('block'); + expect(result.warnings[0].severity).toBe('high'); + expect(result.warnings.some(w => w.type === 'must_follow')).toBe(true); + expect(result.recommended_actions[0]).toMatch(/Do not proceed/); + }); + + it('can record a redacted PreToolUse event for the preflight check', async () => { + const result = await audrey.preflight('edit the release notes', { + tool: 'Edit', + sessionId: 'session-1', + recordEvent: true, + includeCapsule: false, + }); + + expect(result.preflight_event_id).toMatch(/^01/); + const events = audrey.listEvents({ eventType: 'PreToolUse', toolName: 'Edit' }); + expect(events).toHaveLength(1); + expect(events[0].session_id).toBe('session-1'); + expect(events[0].input_hash).toMatch(/^[a-f0-9]{64}$/); + }); +}); diff --git a/tests/reflexes.test.js b/tests/reflexes.test.js new file mode 100644 index 0000000..b610cd9 --- /dev/null +++ b/tests/reflexes.test.js @@ -0,0 +1,64 @@ +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { existsSync, rmSync, mkdirSync } from 'node:fs'; +import { Audrey } from '../dist/src/index.js'; + +const TEST_DIR = './test-reflexes-data'; + +describe('Memory Reflexes', () => { + let audrey; + + beforeEach(() => { + if (existsSync(TEST_DIR)) rmSync(TEST_DIR, { recursive: true }); + mkdirSync(TEST_DIR, { recursive: true }); + audrey = new Audrey({ + dataDir: TEST_DIR, + agent: 'reflex-test', + embedding: { provider: 'mock', dimensions: 8 }, + }); + }); + + afterEach(() => { + audrey.close(); + if (existsSync(TEST_DIR)) rmSync(TEST_DIR, { recursive: true }); + }); + + it('turns a repeated tool failure into a warning reflex', async () => { + audrey.observeTool({ + event: 'PostToolUse', + tool: 'npm test', + outcome: 'failed', + errorSummary: 'Vitest failed with spawn EPERM on this Windows host', + cwd: process.cwd(), + }); + + const report = await audrey.reflexes('run npm test before release', { + tool: 'npm test', + }); + + expect(report.decision).toBe('caution'); + expect(report.summary).toMatch(/memory reflex/i); + expect(report.reflexes).toHaveLength(1); + expect(report.reflexes[0].response_type).toBe('warn'); + expect(report.reflexes[0].trigger).toBe('Before using npm test'); + expect(report.reflexes[0].source).toBe('recent_failure'); + expect(report.preflight).toBeUndefined(); + }); + + it('can include the underlying preflight report for explainability', async () => { + await audrey.encode({ + content: 'Never deploy Audrey without checking the package tarball first.', + source: 'direct-observation', + tags: ['must-follow', 'release'], + }); + + const report = await audrey.reflexes('deploy Audrey release', { + strict: true, + includePreflight: true, + }); + + expect(report.decision).toBe('block'); + expect(report.reflexes.some(r => r.response_type === 'block')).toBe(true); + expect(report.preflight.decision).toBe('block'); + expect(report.evidence_ids.length).toBeGreaterThan(0); + }); +});