diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 0000000..8a24597 --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,15 @@ +# Changelog + +## 0.21.0 - Release Diagnostics and Host Setup + +- Added `npx audrey doctor` for first-contact diagnostics, JSON automation, provider checks, MCP entrypoint validation, memory-store health, and host config generation. +- Added `npx audrey install --host --dry-run` so Codex, Claude Code, Claude Desktop, Cursor, Windsurf, VS Code, JetBrains, and generic MCP hosts can preview setup without accidental config writes. +- Updated docs around the recommended first run: `doctor`, `demo`, safe host install preview, then host-specific verification. +- Kept Claude Code's direct installer intact while making the default release story host-neutral. +- Refreshed lockfile transitive packages through the npm resolver; vulnerability audit remains clean. + +## 0.20.0 - Memory Reflexes + +- Added Memory Preflight and Memory Reflexes so agents can check memory before acting and turn repeated failures into trigger-response guidance. +- Added Ollama/local-agent guidance and runnable local-agent example. +- Expanded host-neutral MCP docs and Audrey for Dummies onboarding. diff --git a/README.md b/README.md index 52ad037..343e726 100644 --- a/README.md +++ b/README.md @@ -1,56 +1,96 @@ -# Audrey +
+ Audrey wordmark -[![CI](https://github.com/Evilander/Audrey/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/Evilander/Audrey/actions/workflows/ci.yml) -[![npm version](https://img.shields.io/npm/v/audrey.svg)](https://www.npmjs.com/package/audrey) -[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE) +

The local-first memory control plane for AI agents.

-Audrey is a local-first memory runtime and continuity engine for AI agents. +

+ Give Codex, Claude Code, Claude Desktop, Cursor, Windsurf, VS Code, JetBrains, Ollama-backed agents, + and custom agent services one durable memory layer they can check before they act. +

-It gives Codex, Claude Code, Claude Desktop, Cursor, local Ollama-backed agents, and custom agent services a shared local memory store, durable recall, consolidation, contradiction handling, a REST sidecar, MCP tools, and benchmark gates without adding external infrastructure. +

+ CI + npm version + MIT license +

+
-Audrey also checks memory before an agent acts. Known failures, project rules, and local quirks become preflight warnings and Memory Reflexes instead of repeated mistakes. +## Why Audrey Exists -Requires Node.js 20+. +Agents forget the exact mistakes they made yesterday. They repeat broken commands, lose project-specific rules, miss contradictions, and treat every new session like a cold start. + +Audrey turns those hard-won lessons into a local memory runtime: + +- `memory_recall` finds durable context by semantic similarity. +- `memory_preflight` checks prior failures, risks, rules, and relevant procedures before an action. +- `memory_reflexes` converts remembered evidence into trigger-response guidance agents can follow. +- `memory_dream` consolidates episodes into principles and applies decay. +- `audrey doctor` tells a human or CI system whether the runtime is actually ready. + +It is not a hosted vector database, a notes app, or a Claude-only plugin. Audrey is a SQLite-backed continuity layer that can sit under any local or sidecar agent loop. + +
+ Audrey feature marks: memory continuity, archive signal, recall loop, layered evidence, local node, and remembering before acting +
## Quick Start -### 60-Second Proof +Requires Node.js 20+. ```bash +npx audrey doctor npx audrey demo ``` -This runs a self-contained local demo with no API keys, no host setup, and no external model. It writes temporary memories, records a redacted tool failure, asks Audrey for a Memory Capsule, proves recall, then deletes the demo store. +`doctor` verifies Node, the MCP entrypoint, provider selection, memory-store health, and host config generation. `demo` runs a no-key, no-host, no-network proof: it creates temporary memories, records a redacted failed tool trace, generates a Memory Capsule, proves recall, prints Memory Reflexes, and deletes the demo store. + +Expected first-run shape: + +```text +Audrey Doctor v0.21.0 +Store health: not initialized +Verdict: ready +``` + +After the first real memory write, `doctor` should report the store as healthy. -### MCP Hosts +## Install Into Agent Hosts + +Preview host setup without editing config files: + +```bash +npx audrey install --host codex --dry-run +npx audrey install --host claude-code --dry-run +npx audrey install --host generic --dry-run +``` + +Generate raw config blocks: ```bash npx audrey mcp-config codex npx audrey mcp-config generic +npx audrey mcp-config vscode ``` -`mcp-config codex` prints a ready-to-paste Codex TOML block. `mcp-config generic` prints JSON for local stdio MCP hosts such as Claude Desktop, Cursor, Windsurf, and JetBrains. - -Claude Code also has a direct installer: +Claude Code can be registered directly: ```bash npx audrey install claude mcp list ``` -All MCP paths use local embeddings by default and store memory in one SQLite-backed data directory. +All local MCP paths default to local embeddings and one shared SQLite-backed memory directory. Use `AUDREY_DATA_DIR` to isolate projects, tenants, or host identities. -### Ollama and Local Agents +## Use With Ollama And Local Agents -Ollama is a local model runtime, not a memory store. Use Audrey as the sidecar memory tool layer for any Ollama-backed agent: +Ollama runs models; Audrey supplies memory. Start Audrey as a local REST sidecar and expose its routes as tools in your agent loop: ```bash AUDREY_AGENT=ollama-local-agent npx audrey serve curl http://localhost:7437/health +curl http://localhost:7437/v1/status ``` -Then expose Audrey's `/v1/preflight`, `/v1/reflexes`, `/v1/encode`, `/v1/recall`, `/v1/capsule`, and `/v1/status` routes as tools in the local agent loop. - Runnable example: ```bash @@ -58,57 +98,43 @@ AUDREY_AGENT=ollama-local-agent npx audrey serve OLLAMA_MODEL=qwen3 node examples/ollama-memory-agent.js "What should you remember about Audrey?" ``` -### REST or Docker Sidecar +Core sidecar tools: -```bash -docker compose up -d --build -``` - -Then verify: +| Agent Need | REST Route | +|---|---| +| Check memory before acting | `POST /v1/preflight` | +| Get reflex rules for an action | `POST /v1/reflexes` | +| Store a useful observation | `POST /v1/encode` | +| Recall relevant context | `POST /v1/recall` | +| Get a turn-sized memory packet | `POST /v1/capsule` | +| Check health | `GET /v1/status` | -```bash -npx audrey status -curl http://localhost:3487/health -``` +## What Ships -## Why Audrey +| Surface | Status | +|---|---| +| MCP stdio server | 19 tools, resources, and prompt templates | +| CLI | `doctor`, `demo`, `install`, `mcp-config`, `status`, `dream`, `reembed`, `observe-tool`, `promote` | +| REST API | Hono server with `/health`, `/openapi.json`, `/docs`, and `/v1/*` routes | +| JavaScript SDK | Direct TypeScript/Node import from `audrey` | +| Python client | `pip install audrey-memory`, calls the REST sidecar | +| Storage | Local SQLite plus `sqlite-vec`, no hosted database required | +| Deployment | npm package, Docker, Compose, host-specific MCP config generation | +| Safety loop | preflight warnings, reflexes, redacted tool traces, contradiction handling | -- Local-first: memory lives in SQLite with `sqlite-vec`, not a hosted vector database. -- Host-neutral: Audrey is a memory runtime for agent hosts, not a Claude-only extension. -- Practical: MCP, CLI, REST, JavaScript, Python, and Docker are all first-class. -- Durable: export/import, health checks, benchmark gates, and graceful shutdown are built in. -- Structured: Audrey does more than save notes. It consolidates, decays, tracks contradictions, and supports procedural memory. +## Memory Model -## What Ships +Audrey is built around the parts of memory that matter for agents: -- Local stdio MCP server with 19 memory tools -- Ready-to-paste config generation for Codex and generic MCP hosts -- Hook-compatible CLI helpers for recall, reflection, and tool trace capture -- JavaScript SDK -- Python SDK packaged as `audrey-memory` -- REST API for sidecar deployment and Ollama/local-agent tool bridges -- Memory Preflight for checking prior failures, risks, rules, and procedures before an agent acts -- Memory Reflexes that convert preflight evidence into trigger-response guidance agents can automate -- Docker and Compose deployment path -- Export/import for portable memory state -- Machine-readable health and benchmark gates -- Local benchmark harness with retrieval and lifecycle-operation tracks - -## Integration Modes - -| Mode | Best For | Entry Point | -|---|---|---| -| MCP stdio | Codex, Claude Code, Claude Desktop, Cursor, Windsurf, VS Code, JetBrains | `npx audrey mcp-config ` or `npx audrey install` for Claude Code | -| REST sidecar | Ollama-backed local agents, internal agent services, Docker | `npx audrey serve` or `docker compose up -d --build` | -| SDK direct | Node.js and TypeScript agents inside one process | `import { Audrey } from 'audrey'` | -| Python client | Python agents calling the REST sidecar | `pip install audrey-memory` | - -Useful checks: +- Episodic memory: specific observations, tool results, preferences, and session facts. +- Semantic memory: consolidated principles extracted from repeated evidence. +- Procedural memory: remembered ways to act, avoid, retry, or verify. +- Affect and salience: emotional weight and importance influence recall. +- Interference and decay: stale, conflicting, or low-confidence memories lose authority over time. +- Contradiction handling: competing claims are tracked instead of silently overwritten. +- Tool-trace learning: failed commands and risky actions become future preflight warnings. -```bash -npx audrey status -npx audrey status --json --fail-on-unhealthy -``` +The product bet is simple: the next generation of useful agents will not just retrieve facts. They will remember what happened, decide whether a memory is still trustworthy, and use that memory before touching tools. ## Use Audrey From Code @@ -144,45 +170,46 @@ pip install audrey-memory ```python from audrey_memory import Audrey -brain = Audrey( - base_url="http://127.0.0.1:3487", - api_key="secret", - agent="support-agent", -) - -memory_id = brain.encode( - "Stripe returns HTTP 429 above 100 req/s", - source="direct-observation", -) +brain = Audrey(base_url="http://127.0.0.1:7437", agent="support-agent") +memory_id = brain.encode("Stripe returns HTTP 429 above 100 req/s", source="direct-observation") results = brain.recall("stripe rate limit", limit=5) brain.close() ``` -## Key Commands +## Production Readiness + +Audrey is close to a 1.0-ready local memory runtime, but production depends on how it is embedded. Treat it like stateful infrastructure. + +Release gates used for this package: ```bash -# Setup +npm run build +npm run typecheck +npm run bench:memory:check +npm pack --dry-run +npx audrey doctor npx audrey demo -npx audrey mcp-config codex -npx audrey mcp-config generic - -# MCP integration -npx audrey install -npx audrey uninstall +``` -# Health and maintenance -npx audrey status -npx audrey dream -npx audrey reembed -npx audrey observe-tool --event PostToolUse --tool Bash --outcome failed +Recommended runtime checks: -# Sidecar -npx audrey serve -node examples/ollama-memory-agent.js "Use Audrey memory before answering" -docker compose up -d --build +```bash +npx audrey doctor --json +npx audrey status --json --fail-on-unhealthy +npx audrey install --host codex --dry-run ``` -Before risky actions, hosts can call `memory_preflight` or `memory_reflexes` over MCP, or `POST /v1/preflight` / `POST /v1/reflexes` over REST. Preflight returns the risk briefing. Reflexes return trigger-response rules such as "Before using npm test, review the prior EPERM failure path." +Production controls you still own: + +- Set one `AUDREY_DATA_DIR` per tenant, environment, or isolation boundary. +- Pin `AUDREY_EMBEDDING_PROVIDER` and `AUDREY_LLM_PROVIDER` explicitly. +- Back up the SQLite data directory before provider or dimension changes. +- Keep API keys and raw credentials out of encoded memory content. +- Use `AUDREY_API_KEY` if the REST sidecar is reachable beyond the local process boundary. +- Run `npx audrey dream` on a schedule so consolidation and decay stay current. +- Add application-level encryption, retention, access control, and audit logging for regulated environments. + +Read the full guide: [docs/production-readiness.md](docs/production-readiness.md). ## Benchmarks @@ -193,63 +220,57 @@ npm run bench:memory npm run bench:memory:check ``` -The benchmark suite measures: - -- retrieval behavior -- update and overwrite behavior -- delete and abstain behavior -- semantic and procedural merge behavior - Current repo snapshot: ![Audrey local benchmark](docs/assets/benchmarks/local-benchmark.svg) -For detailed methodology, published comparison anchors, and generated reports, see [docs/benchmarking.md](docs/benchmarking.md). - -## Production - -Audrey is strongest in workflows where memory must stay local, reviewable, and durable. It already fits well as a sidecar for internal agents in operational domains like financial services and healthcare operations, but it is a memory layer, not a compliance boundary. +The benchmark suite covers retrieval behavior, overwrite behavior, delete/abstain behavior, and semantic/procedural merge behavior. For methodology and comparison anchors, see [docs/benchmarking.md](docs/benchmarking.md). -Production guide: [docs/production-readiness.md](docs/production-readiness.md) +## Command Reference -Examples: - -- [examples/fintech-ops-demo.js](examples/fintech-ops-demo.js) -- [examples/healthcare-ops-demo.js](examples/healthcare-ops-demo.js) -- [examples/stripe-demo.js](examples/stripe-demo.js) - -## Environment +```bash +# First contact +npx audrey doctor +npx audrey demo -Starter config: +# MCP setup +npx audrey install --host codex --dry-run +npx audrey mcp-config codex +npx audrey mcp-config generic +npx audrey install +npx audrey uninstall -- [.env.example](.env.example) -- [.env.docker.example](.env.docker.example) +# Health and maintenance +npx audrey status +npx audrey status --json --fail-on-unhealthy +npx audrey dream +npx audrey reembed -Key environment variables: +# Tool-trace learning +npx audrey observe-tool --event PostToolUse --tool Bash --outcome failed +npx audrey promote --dry-run -- `AUDREY_DATA_DIR` -- `AUDREY_EMBEDDING_PROVIDER` -- `AUDREY_LLM_PROVIDER` -- `AUDREY_DEVICE` -- `AUDREY_API_KEY` -- `AUDREY_HOST` -- `AUDREY_PORT` +# REST sidecar +npx audrey serve +docker compose up -d --build +``` ## Documentation -- [docs/benchmarking.md](docs/benchmarking.md) -- [docs/audrey-for-dummies.md](docs/audrey-for-dummies.md) -- [docs/future-of-llm-memory.md](docs/future-of-llm-memory.md) -- [docs/production-readiness.md](docs/production-readiness.md) -- [docs/mcp-hosts.md](docs/mcp-hosts.md) -- [docs/ollama-local-agents.md](docs/ollama-local-agents.md) -- [CONTRIBUTING.md](CONTRIBUTING.md) -- [SECURITY.md](SECURITY.md) +- [Audrey for Dummies](docs/audrey-for-dummies.md) +- [MCP host guide](docs/mcp-hosts.md) +- [Ollama and local agents](docs/ollama-local-agents.md) +- [Production readiness](docs/production-readiness.md) +- [Future of LLM memory](docs/future-of-llm-memory.md) +- [Benchmarking](docs/benchmarking.md) +- [Security policy](SECURITY.md) ## Development ```bash npm ci +npm run build +npm run typecheck npm test npm run bench:memory:check npm run pack:check @@ -257,6 +278,8 @@ python -m unittest discover -s python/tests -v python -m build --no-isolation python ``` +On some locked-down Windows hosts, Vitest/Vite can fail before tests start with `spawn EPERM`. That is an environment process-spawn blocker, not an Audrey runtime failure. Use build, typecheck, benchmark, pack dry-run, direct `dist/` smokes, and GitHub Actions as the release evidence path. + ## License MIT. See [LICENSE](LICENSE). diff --git a/codex.md b/codex.md index 5cf7ffa..cd38355 100644 --- a/codex.md +++ b/codex.md @@ -4,7 +4,7 @@ ## What Audrey Is -Audrey is a **biological memory system and local-first continuity runtime for AI agents**. It gives Codex, Claude Code, Claude Desktop, Ollama-backed local agents, and custom agent services persistent local memory that encodes, consolidates, decays, and dreams - modeled after how human brains actually process memory. Published on npm as `audrey` (v0.20.0) and PyPI as `audrey-memory` (v0.20.0). +Audrey is a **biological memory system and local-first continuity runtime for AI agents**. It gives Codex, Claude Code, Claude Desktop, Ollama-backed local agents, and custom agent services persistent local memory that encodes, consolidates, decays, and dreams - modeled after how human brains actually process memory. This release targets npm package `audrey` v0.21.0; the Python client is separately versioned under `python/`. **Not a database.** Not a RAG pipeline. Not a vector store. Audrey is a *memory layer* with biological fidelity: episodic memories consolidate into semantic principles, confidence decays over time, contradictions are tracked and resolved, emotional affect influences recall, and interference between competing memories is modeled explicitly. @@ -68,7 +68,7 @@ audrey/ ├── mcp-server/ # MCP server + CLI (2 modules) │ ├── index.ts # 19 MCP tools + CLI (install/uninstall/status/greeting/reflect/dream/reembed/serve) │ └── config.ts # Provider resolution, VERSION constant, install args -├── python-sdk/ # Python SDK (pip install audrey-memory) +├── python/ # Python SDK (pip install audrey-memory) │ ├── pyproject.toml # Hatchling build, deps: httpx + pydantic │ ├── src/audrey_memory/ │ │ ├── __init__.py # Public API exports @@ -99,7 +99,7 @@ audrey/ ├── .github/workflows/ci.yml # CI: Node 18/20/22 Ubuntu + Windows smoke ├── tsconfig.json # Strict TS, Node16 module resolution, outDir: ./dist ├── vitest.config.js # Test config (excludes stale dirs) -├── package.json # v0.20.0, ES modules, exports: . + ./mcp + ./server +├── package.json # v0.21.0, ES modules, exports: . + ./mcp + ./server └── codex.md # This file ``` @@ -137,6 +137,8 @@ brain.close(); ```bash npx audrey demo # self-contained local proof, no keys or host setup +npx audrey doctor # first-contact diagnostics and release-gate JSON +npx audrey install --host codex --dry-run # safe host setup preview npx audrey mcp-config codex # prints ready-to-paste Codex TOML npx audrey mcp-config generic # prints JSON for stdio MCP hosts npx audrey install # registers MCP server with Claude Code @@ -206,7 +208,7 @@ npm run bench:memory:check npm run pack:check # Python SDK tests (separate) -cd python-sdk +cd python pip install -e ".[dev]" pytest -m "not integration" -v ``` @@ -317,7 +319,7 @@ Auto-detection priority: `GOOGLE_API_KEY` → Gemini embeddings; `ANTHROPIC_API_ These are from the approved roadmap in `docs/superpowers/specs/2026-04-10-audrey-industry-standard-design.md`. -### v0.21: LoCoMo Benchmark Adapter (HIGH PRIORITY) +### v0.22: LoCoMo Benchmark Adapter (HIGH PRIORITY) **Why:** Audrey currently has an internal benchmark (100% score, 43.8 points ahead of baselines). But there's no direct reproduction of the LoCoMo benchmark protocol, which is what Mem0 (66.9), Letta (74.0), and MIRIX (85.4) report against. Publishing a LoCoMo number is the single biggest credibility move for the research community. @@ -381,9 +383,9 @@ See `docs/superpowers/specs/2026-04-10-audrey-industry-standard-design.md` for t 5. **Python SDK requires running server** — The Python SDK is an HTTP client, not a native implementation. Users must run `npx audrey serve` separately. A native Python port is planned post-1.0 if demand warrants it. -6. **No OpenAPI spec** — The HTTP API has no auto-generated OpenAPI documentation. The Zod schemas exist and could generate one via `@hono/zod-openapi`, but it's not wired up yet. +6. **OpenAPI coverage still needs release-grade examples** - `/openapi.json` and `/docs` are wired through `OpenAPIHono`, but the spec should keep gaining richer request/response examples before 1.0. -7. **Benchmark uses mock embeddings** — The internal benchmark runs with mock embeddings (deterministic hashes, 64d). Real embedding providers would produce different (likely better) scores. The LoCoMo adapter (v0.21) will address this. +7. **Benchmark uses mock embeddings** - The internal benchmark runs with mock embeddings (deterministic hashes, 64d). Real embedding providers would produce different scores. The LoCoMo adapter is the next credibility milestone. ## How to Add Providers @@ -408,8 +410,8 @@ See `docs/superpowers/specs/2026-04-10-audrey-industry-standard-design.md` for t 1. Add route to `src/routes.ts` following the existing pattern 2. Add test to `tests/http-api.test.js` -3. Add corresponding method to Python SDK clients (`python-sdk/src/audrey_memory/client.py` and `async_client.py`) -4. Add Pydantic model to `python-sdk/src/audrey_memory/models.py` if new response shape +3. Add corresponding method to Python SDK clients (`python/audrey_memory/client.py` and `async_client.py`) +4. Add Pydantic model to `python/audrey_memory/models.py` if new response shape ### New MCP Tool @@ -478,8 +480,8 @@ Audrey's moat: biological fidelity (affect + interference + consolidation + drea 3. Merge to master with `--no-ff` 4. Tag: `git tag v0.X.0` 5. Bump VERSION in `mcp-server/config.ts` and `package.json` -6. If Python SDK changed, bump version in `python-sdk/pyproject.toml` -7. Publish: `npm publish` (Node.js) and `cd python-sdk && python -m build && twine upload dist/*` (Python) +6. If Python SDK changed, bump version in `python/audrey_memory/_version.py` +7. Publish: `npm publish` (Node.js) and `cd python && python -m build && twine upload dist/*` (Python) ## Codex-Specific Notes diff --git a/docs/assets/audrey-feature-grid.jpg b/docs/assets/audrey-feature-grid.jpg new file mode 100644 index 0000000..5c5e224 Binary files /dev/null and b/docs/assets/audrey-feature-grid.jpg differ diff --git a/docs/assets/audrey-logo.svg b/docs/assets/audrey-logo.svg new file mode 100644 index 0000000..06e8bf1 --- /dev/null +++ b/docs/assets/audrey-logo.svg @@ -0,0 +1,45 @@ + + Audrey logo + A luminous memory orb and Audrey wordmark for a local-first AI memory runtime. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Audrey + LOCAL-FIRST MEMORY FOR AGENTS + + recall / preflight / reflexes / consolidation / one SQLite brain + diff --git a/docs/assets/audrey-wordmark.png b/docs/assets/audrey-wordmark.png new file mode 100644 index 0000000..d6f6eae Binary files /dev/null and b/docs/assets/audrey-wordmark.png differ diff --git a/docs/audrey-for-dummies.md b/docs/audrey-for-dummies.md index 9fd0367..4c4b5c2 100644 --- a/docs/audrey-for-dummies.md +++ b/docs/audrey-for-dummies.md @@ -195,10 +195,11 @@ POST /v1/reflexes Run: ```bash +npx audrey doctor npx audrey demo ``` -This does not need API keys, Claude, Codex, Ollama, or any hosted model. +`doctor` checks whether Audrey can run on your machine. The demo does not need API keys, Claude, Codex, Ollama, or any hosted model. The demo: @@ -228,6 +229,8 @@ Examples: Generate host config: ```bash +npx audrey install --host codex --dry-run +npx audrey install --host generic --dry-run npx audrey mcp-config codex npx audrey mcp-config generic npx audrey mcp-config vscode @@ -410,8 +413,12 @@ Use these rules when deciding what Audrey should remember. ```bash # Run the local proof demo +npx audrey doctor npx audrey demo +# Preview host setup without editing files +npx audrey install --host codex --dry-run + # Print Codex MCP config npx audrey mcp-config codex @@ -428,6 +435,7 @@ npx audrey uninstall npx audrey serve # Check memory health +npx audrey doctor --json npx audrey status npx audrey status --json --fail-on-unhealthy @@ -555,6 +563,7 @@ Audrey can remember: Run: ```bash +npx audrey doctor npx audrey status node --version ``` @@ -566,6 +575,8 @@ Audrey requires Node.js 20 or newer. Generate a pinned config: ```bash +npx audrey install --host codex --dry-run +npx audrey install --host generic --dry-run npx audrey mcp-config codex npx audrey mcp-config generic ``` @@ -652,7 +663,7 @@ The best use is: ## Where To Go Next -- Run `npx audrey demo`. +- Run `npx audrey doctor`, then `npx audrey demo`. - Read `docs/mcp-hosts.md` to connect Codex, Claude, Cursor, Windsurf, VS Code, or JetBrains. - Read `docs/ollama-local-agents.md` for local Ollama-backed agents. - Read `docs/production-readiness.md` before using Audrey in a real deployment. diff --git a/docs/mcp-hosts.md b/docs/mcp-hosts.md index 1cdfbb1..8f785d1 100644 --- a/docs/mcp-hosts.md +++ b/docs/mcp-hosts.md @@ -5,11 +5,16 @@ Audrey ships as a local stdio MCP server. Claude Code is only one host; the same For pinned configs that launch the built Audrey entrypoint directly: ```bash +npx audrey doctor +npx audrey install --host codex --dry-run +npx audrey install --host generic --dry-run npx audrey mcp-config codex npx audrey mcp-config generic npx audrey mcp-config vscode ``` +`doctor` verifies the runtime, local memory store, provider configuration, and config-generation path. `install --host --dry-run` prints setup instructions without writing to a host config file. That is the safest first pass when Codex, Cursor, Windsurf, VS Code, or JetBrains manage their own config formats. + For portable configs that always resolve the latest published package, launch with `npx`: ```json @@ -46,6 +51,7 @@ Codex uses TOML under `C:\Users\\.codex\config.toml` on Windows. Generate a pinned block: ```bash +npx audrey install --host codex --dry-run npx audrey mcp-config codex ``` @@ -70,11 +76,12 @@ Use one shared `AUDREY_DATA_DIR` if Codex and other hosts should remember the sa Claude Code can use Audrey through the built-in installer: ```bash +npx audrey install --host claude-code --dry-run npx audrey install claude mcp list ``` -The installer persists a Claude Code `AUDREY_AGENT=claude-code` identity while still using the same Audrey MCP runtime as every other host. +The dry-run prints the exact shape before any host changes. The real installer persists a Claude Code `AUDREY_AGENT=claude-code` identity while still using the same Audrey MCP runtime as every other host. ## Claude Desktop diff --git a/docs/production-readiness.md b/docs/production-readiness.md index 47c18ce..ef716b4 100644 --- a/docs/production-readiness.md +++ b/docs/production-readiness.md @@ -2,7 +2,7 @@ Audrey is ready to be the memory layer inside a production agent system, but it is not a complete regulated-platform package by itself. Treat it as stateful infrastructure: pin providers, isolate tenants, monitor health, and wrap it with the controls your environment requires. -First contact should now go through `npx audrey mcp-config ` for local MCP hosts, `npx audrey install` for Claude Code specifically, or `npx audrey serve` for the sidecar path. Run `npx audrey status --json --fail-on-unhealthy` before exposing Audrey to real traffic. +First contact should now go through `npx audrey doctor`, then `npx audrey install --host --dry-run` for local MCP hosts, `npx audrey install` for Claude Code specifically, or `npx audrey serve` for the sidecar path. Run `npx audrey status --json --fail-on-unhealthy` before exposing Audrey to real traffic. ## Best Vertical Fit @@ -56,19 +56,24 @@ Guardrails: 1. Pin `AUDREY_EMBEDDING_PROVIDER` and `AUDREY_LLM_PROVIDER` explicitly. Do not rely on key-based auto-detection in production. 2. Set a dedicated `AUDREY_DATA_DIR` per environment and per tenant boundary. -3. Add a health check that runs `npx audrey status --json --fail-on-unhealthy`. +3. Add a startup check that runs `npx audrey doctor --json`. 4. Alert on `health.healthy=false` or `health.reembed_recommended=true`. 5. Schedule `npx audrey dream` during low-traffic windows so consolidation and decay stay current. 6. Backup the SQLite data directory before changing embedding dimensions or providers. 7. Treat re-embedding as a controlled maintenance action and validate with `npx audrey status`. -8. Keep API keys, bearer tokens, and raw credentials out of encoded memory content. -9. Decide whether `private` memories are allowed for your use case and document who can create them. -10. Add application-level encryption, access control, logging, and retention policies around Audrey. -11. On graceful shutdown paths, call `await brain.waitForIdle()` before `brain.close()` so tracked background work drains cleanly. +8. Use `npx audrey install --host --dry-run` in deployment docs so operators can preview host config without accidental writes. +9. Keep API keys, bearer tokens, and raw credentials out of encoded memory content. +10. Decide whether `private` memories are allowed for your use case and document who can create them. +11. Add application-level encryption, access control, logging, and retention policies around Audrey. +12. On graceful shutdown paths, call `await brain.waitForIdle()` before `brain.close()` so tracked background work drains cleanly. ## Operations Commands ```bash +# First-contact diagnostics +npx audrey doctor +npx audrey doctor --json + # Human-readable health npx audrey status diff --git a/mcp-server/config.ts b/mcp-server/config.ts index 1652b6e..cf5784a 100644 --- a/mcp-server/config.ts +++ b/mcp-server/config.ts @@ -3,7 +3,7 @@ import { join } from 'node:path'; import { fileURLToPath } from 'node:url'; import type { AudreyConfig, EmbeddingConfig, LLMConfig } from '../src/types.js'; -export const VERSION = '0.20.0'; +export const VERSION = '0.21.0'; export const SERVER_NAME = 'audrey-memory'; export const DEFAULT_AGENT = 'local-agent'; export const DEFAULT_DATA_DIR = join(homedir(), '.audrey', 'data'); diff --git a/mcp-server/index.ts b/mcp-server/index.ts index 5f97edc..9bc6517 100644 --- a/mcp-server/index.ts +++ b/mcp-server/index.ts @@ -1,6 +1,6 @@ #!/usr/bin/env node import { z } from 'zod'; -import { homedir, tmpdir } from 'node:os'; +import { homedir, platform, tmpdir } from 'node:os'; import { join, resolve } from 'node:path'; import { existsSync, mkdirSync, mkdtempSync, readFileSync, rmSync } from 'node:fs'; import { execFileSync } from 'node:child_process'; @@ -9,8 +9,9 @@ import { Audrey } from '../src/index.js'; import { readStoredDimensions } from '../src/db.js'; import type { AudreyConfig, EmbeddingProvider, IntrospectResult, MemoryStatusResult } from '../src/types.js'; import { - VERSION, + VERSION, SERVER_NAME, + MCP_ENTRYPOINT, buildAudreyConfig, buildInstallArgs, formatMcpHostConfig, @@ -147,7 +148,7 @@ export const memoryReflexesToolSchema = { // Local interface for status reporting // --------------------------------------------------------------------------- -interface StatusReport { +export interface StatusReport { generatedAt: string; registered: boolean; dataDir: string; @@ -156,12 +157,36 @@ interface StatusReport { stats: IntrospectResult | null; health: MemoryStatusResult | null; lastConsolidation: string | null; - error: string | null; -} - -// --------------------------------------------------------------------------- -// CLI subcommands -// --------------------------------------------------------------------------- + error: string | null; +} + +export type DoctorSeverity = 'info' | 'warning' | 'error'; + +export interface DoctorCheck { + name: string; + ok: boolean; + severity: DoctorSeverity; + message: string; + hint?: string; +} + +export interface DoctorReport { + generatedAt: string; + version: string; + node: string; + platform: string; + entrypoint: string; + dataDir: string; + embedding: string; + llm: string; + status: StatusReport; + checks: DoctorCheck[]; + ok: boolean; +} + +// --------------------------------------------------------------------------- +// CLI subcommands +// --------------------------------------------------------------------------- async function serveHttp(): Promise { const { startServer } = await import('../src/server.js'); @@ -436,10 +461,71 @@ async function reflect(): Promise { } } -function install(): void { - try { - execFileSync('claude', ['--version'], { stdio: 'ignore' }); - } catch { +interface InstallOptions { + host: string; + dryRun: boolean; +} + +function parseInstallOptions(argv: string[] = process.argv): InstallOptions { + let host = 'claude-code'; + let dryRun = false; + + for (let i = 3; i < argv.length; i += 1) { + const arg = argv[i] ?? ''; + if (arg === '--dry-run' || arg === '--print') { + dryRun = true; + } else if (arg === '--host') { + host = argv[i + 1] || host; + i += 1; + } else if (arg.startsWith('--host=')) { + host = arg.slice('--host='.length) || host; + } else if (!arg.startsWith('-')) { + host = arg; + } + } + + return { host, dryRun }; +} + +export function formatInstallGuide( + host: string, + env: Record = process.env, + dryRun = false, +): string { + const normalizedHost = host || 'claude-code'; + const title = dryRun || normalizedHost === 'claude-code' + ? `Audrey install preview for ${normalizedHost}` + : `Audrey config-only install for ${normalizedHost}`; + const lines = [ + title, + '', + 'No host config files were modified.', + '', + 'Generated MCP config:', + formatMcpHostConfig(normalizedHost, env), + '', + 'Next steps:', + ]; + + if (normalizedHost === 'claude-code') { + lines.push('- Run without --dry-run to register Audrey through Claude Code: npx audrey install --host claude-code'); + lines.push('- Verify with: claude mcp list'); + } else if (normalizedHost === 'codex') { + lines.push('- Paste the TOML block into C:\\Users\\\\.codex\\config.toml under the MCP server section.'); + lines.push('- Restart Codex, then run: codex mcp list'); + } else { + lines.push('- Paste the JSON block into your host MCP configuration.'); + lines.push('- Restart the host and look for the audrey-memory MCP server.'); + } + + lines.push('- Run a local health check any time with: npx audrey doctor'); + return lines.join('\n'); +} + +function installClaudeCode(): void { + try { + execFileSync('claude', ['--version'], { stdio: 'ignore' }); + } catch { console.error('Error: claude CLI not found. Install Claude Code first: https://docs.anthropic.com/en/docs/claude-code'); process.exit(1); } @@ -480,8 +566,8 @@ function install(): void { console.error('Failed to register MCP server. Is Claude Code installed and on your PATH?'); process.exit(1); } - - console.log(` + + console.log(` Audrey registered as "${SERVER_NAME}" with Claude Code. 19 MCP tools available in every session: @@ -507,7 +593,9 @@ Audrey registered as "${SERVER_NAME}" with Claude Code. CLI subcommands: npx audrey demo - Run a 60-second local proof with no network calls + npx audrey doctor - Diagnose runtime, store health, and host config readiness npx audrey install - Register MCP server with Claude Code + npx audrey install --host codex --dry-run - Print safe host setup instructions npx audrey mcp-config codex - Print Codex MCP TOML npx audrey mcp-config generic - Print JSON config for other MCP hosts npx audrey uninstall - Remove MCP server registration @@ -521,8 +609,24 @@ CLI subcommands: Data stored in: ${dataDir} Verify: claude mcp list -`); -} +`); +} + +function install(): void { + const options = parseInstallOptions(); + if (options.dryRun || options.host !== 'claude-code') { + try { + console.log(formatInstallGuide(options.host, process.env, options.dryRun)); + } catch (err) { + const message = err instanceof Error ? err.message : String(err); + console.error(`[audrey] install failed: ${message}`); + process.exit(2); + } + return; + } + + installClaudeCode(); +} function uninstall(): void { try { @@ -679,6 +783,7 @@ export async function runDemoCommand({ out(''); out('Next steps:'); + out('- Diagnose your setup: npx audrey doctor'); out('- Codex: npx audrey mcp-config codex'); out('- Any stdio MCP host: npx audrey mcp-config generic'); out('- Ollama/local agents: npx audrey serve, then call /v1/reflexes, /v1/capsule, and /v1/recall as tools'); @@ -786,11 +891,11 @@ export function formatStatusReport(report: StatusReport): string { return lines.join('\n'); } -export function runStatusCommand({ - argv = process.argv, - dataDir = resolveDataDir(process.env), - claudeJsonPath = join(homedir(), '.claude.json'), - out = console.log, +export function runStatusCommand({ + argv = process.argv, + dataDir = resolveDataDir(process.env), + claudeJsonPath = join(homedir(), '.claude.json'), + out = console.log, }: { argv?: string[]; dataDir?: string; @@ -808,20 +913,195 @@ export function runStatusCommand({ || (cliHasFlag('--fail-on-unhealthy', argv) && report.exists && report.health && !report.health.healthy) ? 1 : 0; - - return { report, exitCode }; -} - -function status(): void { - const { exitCode } = runStatusCommand(); - if (exitCode !== 0) { - process.exitCode = exitCode; - } -} - -function toolResult(data: unknown): { content: Array<{ type: 'text'; text: string }> } { - return { content: [{ type: 'text' as const, text: JSON.stringify(data) }] }; -} + + return { report, exitCode }; +} + +function describeEmbedding(env: Record): string { + const embedding = resolveEmbeddingProvider(env, env['AUDREY_EMBEDDING_PROVIDER']); + if (embedding.provider === 'local') { + return `local (${embedding.dimensions}d, device=${embedding.device || 'gpu'})`; + } + return `${embedding.provider} (${embedding.dimensions}d)`; +} + +function describeLlm(env: Record): string { + const llm = resolveLLMProvider(env, env['AUDREY_LLM_PROVIDER']); + return llm ? llm.provider : 'not configured (heuristic mode)'; +} + +function addDoctorCheck( + checks: DoctorCheck[], + name: string, + ok: boolean, + severity: DoctorSeverity, + message: string, + hint?: string, +): void { + checks.push({ name, ok, severity, message, ...(hint ? { hint } : {}) }); +} + +export function buildDoctorReport({ + dataDir = resolveDataDir(process.env), + claudeJsonPath = join(homedir(), '.claude.json'), + env = process.env, + nodeVersion = process.versions.node, +}: { + dataDir?: string; + claudeJsonPath?: string; + env?: Record; + nodeVersion?: string; +} = {}): DoctorReport { + const checks: DoctorCheck[] = []; + const statusReport = buildStatusReport({ dataDir, claudeJsonPath }); + const major = Number.parseInt(nodeVersion.split('.')[0] || '0', 10); + const entrypointExists = existsSync(MCP_ENTRYPOINT); + + addDoctorCheck( + checks, + 'node-runtime', + major >= 20, + major >= 20 ? 'info' : 'error', + `Node.js ${nodeVersion}`, + major >= 20 ? undefined : 'Install Node.js 20 or newer.', + ); + + addDoctorCheck( + checks, + 'mcp-entrypoint', + entrypointExists, + entrypointExists ? 'info' : 'error', + MCP_ENTRYPOINT, + entrypointExists ? undefined : 'Run npm run build before launching Audrey from this checkout.', + ); + + let embedding = 'invalid'; + try { + embedding = describeEmbedding(env); + addDoctorCheck(checks, 'embedding-provider', true, 'info', embedding); + } catch (err) { + const message = err instanceof Error ? err.message : String(err); + addDoctorCheck(checks, 'embedding-provider', false, 'error', message, 'Check AUDREY_EMBEDDING_PROVIDER.'); + } + + let llm = 'not configured (heuristic mode)'; + try { + llm = describeLlm(env); + addDoctorCheck(checks, 'llm-provider', true, 'info', llm); + } catch (err) { + const message = err instanceof Error ? err.message : String(err); + addDoctorCheck(checks, 'llm-provider', false, 'error', message, 'Check AUDREY_LLM_PROVIDER.'); + } + + if (!statusReport.exists) { + addDoctorCheck( + checks, + 'memory-store', + true, + 'info', + `${dataDir} is not created yet`, + 'Run npx audrey demo or connect a host to create the store.', + ); + } else if (statusReport.error) { + addDoctorCheck(checks, 'memory-store', false, 'error', statusReport.error, 'Run npx audrey status --json for details.'); + } else if (!statusReport.health) { + addDoctorCheck(checks, 'memory-store', false, 'error', 'memory store health could not be read'); + } else if (statusReport.health && !statusReport.health.healthy) { + addDoctorCheck(checks, 'memory-store', false, 'error', 'memory vectors are out of sync', 'Run npx audrey reembed.'); + } else { + addDoctorCheck(checks, 'memory-store', true, 'info', 'healthy'); + } + + try { + formatMcpHostConfig('codex', env); + formatMcpHostConfig('generic', env); + addDoctorCheck(checks, 'host-config-generation', true, 'info', 'codex TOML and generic JSON can be generated'); + } catch (err) { + const message = err instanceof Error ? err.message : String(err); + addDoctorCheck(checks, 'host-config-generation', false, 'error', message); + } + + const ok = checks.every(check => check.ok || check.severity !== 'error'); + return { + generatedAt: new Date().toISOString(), + version: VERSION, + node: nodeVersion, + platform: platform(), + entrypoint: MCP_ENTRYPOINT, + dataDir, + embedding, + llm, + status: statusReport, + checks, + ok, + }; +} + +export function formatDoctorReport(report: DoctorReport): string { + const lines = [ + `Audrey Doctor v${report.version}`, + `Runtime: Node.js ${report.node} on ${report.platform}`, + `MCP entrypoint: ${report.entrypoint}`, + `Data directory: ${report.dataDir}`, + `Embedding: ${report.embedding}`, + `LLM: ${report.llm}`, + `Store health: ${report.status.exists ? (report.status.health?.healthy ? 'healthy' : 'needs attention') : 'not initialized'}`, + '', + 'Checks:', + ]; + + for (const check of report.checks) { + const marker = check.ok ? 'OK' : check.severity.toUpperCase(); + lines.push(`- [${marker}] ${check.name}: ${check.message}`); + if (check.hint) lines.push(` hint: ${check.hint}`); + } + + lines.push(''); + lines.push(`Verdict: ${report.ok ? 'ready' : 'blocked'}`); + lines.push(''); + lines.push('Next steps:'); + lines.push('- Prove local behavior: npx audrey demo'); + lines.push('- Preview host setup: npx audrey install --host codex --dry-run'); + lines.push('- Emit automation JSON: npx audrey doctor --json'); + + return lines.join('\n'); +} + +export function runDoctorCommand({ + argv = process.argv, + dataDir = resolveDataDir(process.env), + claudeJsonPath = join(homedir(), '.claude.json'), + env = process.env, + out = console.log, +}: { + argv?: string[]; + dataDir?: string; + claudeJsonPath?: string; + env?: Record; + out?: (...args: unknown[]) => void; +} = {}): { report: DoctorReport; exitCode: number } { + const report = buildDoctorReport({ dataDir, claudeJsonPath, env }); + out(cliHasFlag('--json', argv) ? JSON.stringify(report, null, 2) : formatDoctorReport(report)); + return { report, exitCode: report.ok ? 0 : 1 }; +} + +function status(): void { + const { exitCode } = runStatusCommand(); + if (exitCode !== 0) { + process.exitCode = exitCode; + } +} + +function doctor(): void { + const { exitCode } = runDoctorCommand(); + if (exitCode !== 0) { + process.exitCode = exitCode; + } +} + +function toolResult(data: unknown): { content: Array<{ type: 'text'; text: string }> } { + return { content: [{ type: 'text' as const, text: JSON.stringify(data) }] }; +} function toolError(err: unknown): { isError: boolean; content: Array<{ type: 'text'; text: string }> } { return { isError: true, content: [{ type: 'text' as const, text: `Error: ${(err as Error).message || String(err)}` }] }; @@ -1553,9 +1833,11 @@ if (isDirectRun) { console.error('[audrey] serve failed:', err); process.exit(1); }); - } else if (subcommand === 'status') { - status(); - } else if (subcommand === 'observe-tool') { + } else if (subcommand === 'status') { + status(); + } else if (subcommand === 'doctor') { + doctor(); + } else if (subcommand === 'observe-tool') { observeToolCli().catch(err => { console.error('[audrey] observe-tool failed:', err); process.exit(1); diff --git a/package-lock.json b/package-lock.json index de0f6b7..a7e1898 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,12 +1,12 @@ { "name": "audrey", - "version": "0.20.0", + "version": "0.21.0", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "audrey", - "version": "0.20.0", + "version": "0.21.0", "license": "MIT", "dependencies": { "@hono/node-server": "^1.19.13", diff --git a/package.json b/package.json index 9d9e6cb..ba21e67 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "audrey", - "version": "0.20.0", + "version": "0.21.0", "description": "Local-first memory runtime for AI agents with recall, consolidation, memory reflexes, contradiction detection, and tool-trace learning", "type": "module", "main": "dist/src/index.js", @@ -31,8 +31,12 @@ "docs/future-of-llm-memory.md", "docs/mcp-hosts.md", "docs/ollama-local-agents.md", + "docs/assets/audrey-feature-grid.jpg", + "docs/assets/audrey-logo.svg", + "docs/assets/audrey-wordmark.png", "docs/assets/benchmarks/", "examples/", + "CHANGELOG.md", "README.md", "LICENSE" ], diff --git a/tests/mcp-server.test.js b/tests/mcp-server.test.js index 38bcd09..a9b07ee 100644 --- a/tests/mcp-server.test.js +++ b/tests/mcp-server.test.js @@ -16,7 +16,10 @@ import { } from '../dist/mcp-server/config.js'; import { MAX_MEMORY_CONTENT_LENGTH, + buildDoctorReport, buildStatusReport, + formatDoctorReport, + formatInstallGuide, formatStatusReport, initializeEmbeddingProvider, memoryEncodeToolSchema, @@ -28,6 +31,7 @@ import { registerShutdownHandlers, registerDreamTool, runDemoCommand, + runDoctorCommand, runStatusCommand, validateForgetSelection, } from '../dist/mcp-server/index.js'; @@ -36,8 +40,8 @@ import { existsSync, rmSync } from 'node:fs'; const TEST_DIR = './test-mcp-server'; describe('MCP config', () => { - it('VERSION is 0.20.0', () => { - expect(VERSION).toBe('0.20.0'); + it('VERSION is 0.21.0', () => { + expect(VERSION).toBe('0.21.0'); }); }); @@ -235,6 +239,23 @@ describe('MCP CLI: host-neutral config output', () => { }); }); +describe('MCP CLI: install guidance', () => { + it('prints safe Codex setup without mutating host files', () => { + const text = formatInstallGuide('codex', {}, true); + expect(text).toContain('No host config files were modified'); + expect(text).toContain(`[mcp_servers.${SERVER_NAME}]`); + expect(text).toContain('AUDREY_AGENT = "codex"'); + expect(text).toContain('npx audrey doctor'); + }); + + it('prints a Claude Code dry-run path before invoking the installer', () => { + const text = formatInstallGuide('claude-code', {}, true); + expect(text).toContain('claude-code'); + expect(text).toContain('Run without --dry-run'); + expect(text).toContain('AUDREY_AGENT'); + }); +}); + describe('MCP CLI: demo command', () => { it('prints a self-contained memory demo without external services', async () => { const lines = []; @@ -243,6 +264,7 @@ describe('MCP CLI: demo command', () => { expect(output).toContain('Audrey 60-second memory demo'); expect(output).toContain('Capsule highlights:'); expect(output).toContain('Recall proof:'); + expect(output).toContain('npx audrey doctor'); expect(output).toContain('npx audrey mcp-config codex'); }); }); @@ -426,6 +448,72 @@ describe('MCP status automation', () => { }); }); +describe('MCP doctor automation', () => { + afterEach(() => { + if (existsSync(TEST_DIR)) rmSync(TEST_DIR, { recursive: true, force: true }); + }); + + it('builds a ready report for first-run installs without an existing store', () => { + const report = buildDoctorReport({ + dataDir: './missing-audrey-dir', + claudeJsonPath: './missing-claude-config.json', + env: {}, + nodeVersion: '20.0.0', + }); + + expect(report.version).toBe(VERSION); + expect(report.entrypoint).toBe(MCP_ENTRYPOINT); + expect(report.ok).toBe(true); + expect(report.status.exists).toBe(false); + expect(report.checks.some(check => check.name === 'host-config-generation' && check.ok)).toBe(true); + }); + + it('formats doctor output with a clear verdict and next steps', () => { + const report = buildDoctorReport({ + dataDir: './missing-audrey-dir', + claudeJsonPath: './missing-claude-config.json', + env: {}, + nodeVersion: '20.0.0', + }); + const text = formatDoctorReport(report); + + expect(text).toContain('Audrey Doctor'); + expect(text).toContain('Store health: not initialized'); + expect(text).toContain('Verdict: ready'); + expect(text).toContain('npx audrey install --host codex --dry-run'); + }); + + it('emits JSON and exits non-zero when the store needs repair', async () => { + const audrey = new Audrey({ + dataDir: TEST_DIR, + agent: 'doctor-json-test', + embedding: { provider: 'mock', dimensions: 8 }, + }); + + await audrey.encode({ content: 'doctor health drift episode', source: 'direct-observation' }); + audrey.db.exec('DELETE FROM vec_episodes'); + audrey.db.prepare( + "UPDATE audrey_config SET value = ? WHERE key = 'dimensions'" + ).run('16'); + audrey.close(); + + const lines = []; + const { report, exitCode } = runDoctorCommand({ + argv: ['node', 'mcp-server/index.js', 'doctor', '--json'], + dataDir: TEST_DIR, + claudeJsonPath: './missing-claude-config.json', + env: {}, + out: line => lines.push(line), + }); + + expect(exitCode).toBe(1); + expect(report.ok).toBe(false); + const parsed = JSON.parse(lines[0]); + expect(parsed.ok).toBe(false); + expect(parsed.checks.some(check => check.name === 'memory-store' && !check.ok)).toBe(true); + }); +}); + describe('MCP tool: memory_encode', () => { let audrey;