29 changes: 29 additions & 0 deletions DECISION_LOG.md
@@ -26,6 +26,35 @@ Add a `workmem backup` subcommand that writes an age-encrypted snapshot of the g
- **Include project-scoped DBs automatically.** Rejected: project DBs belong to workspaces, not to the user's top-level knowledge. Auto-including them couples backup to filesystem scanning and makes the unit of restore ambiguous. A `backup` invocation per workspace is explicit.
- **Include telemetry.db in the snapshot.** Rejected: telemetry is operational, rebuildable, and has a different lifecycle than knowledge. Mixing them also risks leaking telemetry via recall if paths cross.

## 2026-04-14: Port telemetry with Go-native refinements and add privacy-strict mode

### Context

The Node reference implementation shipped with opt-in telemetry in a separate SQLite DB (schemas: `tool_calls`, `search_metrics`). Phase 3 deferred the port until the Go MCP entrypoint was real and adopted. That condition is now met: the Go binary serves Claude Code, Kilo, and Codex in production. Time to port.

### Decision

Port the Node telemetry design to Go and preserve the guiding principles (opt-in via env, separate database, counts-only for results, content replaced with `<N chars>`). Refine three Node-era shortcuts:

1. **Nil-tolerant `*Client`** — the client value is `nil` when disabled, every method returns immediately on `nil` receiver. Replaces per-callsite `if TELEMETRY_ENABLED` checks.
2. **No globals** — the client is constructed in `cmd/workmem/main.go` and plumbed via `mcpserver.Config{Telemetry: …}`. Replaces the Node pattern of module-level mutable state (`_telemetryDb`, `_lastSearchMetrics`, etc.).
3. **`SearchMemory` returns `SearchMetrics` as a tuple** — `(results []SearchObservation, metrics SearchMetrics, err error)`. Replaces the Node `_lastSearchMetrics` side-channel.
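A minimal sketch of patterns 1 and 3, with hypothetical simplified types (`Client`, `SearchMetrics`, and `searchMemory` stand in for the real `internal/telemetry` and store code, which may differ):

```go
package main

import "fmt"

// Client is nil when telemetry is disabled; every method returns
// immediately on a nil receiver, so call sites never branch on an
// enabled flag.
type Client struct{ rows int }

func (c *Client) LogToolCall(tool string) {
	if c == nil {
		return // disabled: no-op
	}
	c.rows++
}

// SearchMetrics is returned by value alongside the results — no
// module-level side channel like Node's _lastSearchMetrics.
type SearchMetrics struct {
	CandidatesTotal int
	ResultsReturned int
}

func searchMemory(query string) ([]string, SearchMetrics, error) {
	results := []string{"obs-1", "obs-2"} // placeholder results
	return results, SearchMetrics{CandidatesTotal: 5, ResultsReturned: len(results)}, nil
}

func main() {
	var disabled *Client           // nil: telemetry off
	disabled.LogToolCall("recall") // safe no-op on nil receiver

	results, metrics, err := searchMemory("alpha")
	fmt.Println(len(results), metrics.CandidatesTotal, err)
}
```

The nil-receiver call is legal Go: methods on pointer receivers dispatch statically, so `disabled.LogToolCall` runs the method body with `c == nil` rather than panicking.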

Add a new **privacy-strict mode** (`MEMORY_TELEMETRY_PRIVACY=strict`): entity names, queries, and event labels are sha256-hashed before storage. Intended for sensitive backends (e.g., the `private_memory` server backing therapy/health/relationship content).
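The strict-mode transform can be sketched as a single helper; `hashIfStrict` is a hypothetical name, but the behavior (sha256-hash identifier-like fields before storage, pass through in permissive mode) follows the decision above:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashIfStrict replaces identifier-like fields (entity names, queries,
// event labels) with their sha256 hex digest in strict mode, and passes
// them through unchanged in permissive mode.
func hashIfStrict(value string, strict bool) string {
	if !strict {
		return value
	}
	sum := sha256.Sum256([]byte(value))
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(hashIfStrict("therapist-notes", false)) // permissive: plaintext
	fmt.Println(hashIfStrict("therapist-notes", true))  // strict: 64-char hex digest
}
```

Hashing is one-way but deterministic, so aggregate queries ("how often does the same entity recur?") still work in strict mode even though the plaintext is unrecoverable.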

### Rationale

- Node-era globals would have been awkward in Go and hard to test under parallel `t.Run` — eliminating them keeps the test story clean.
- Privacy-strict closes a real threat: local plaintext telemetry DB on a laptop with sensitive entity names is a leak vector if the laptop is lost/sync'd/exported. Strict mode lets one binary serve two wiring contexts (permissive `memory`, strict `private_memory`) cleanly.
- `SearchMemory` returning metrics as a proper value is idiomatic Go and testable in isolation without the telemetry package.
- Using `modernc.org/sqlite` for the telemetry DB keeps the pure-Go single-binary invariant (no CGO addition just for the analytics path).

### Alternatives considered

- **1:1 port with globals** — Rejected because Go's `database/sql` + `sql.Stmt` lifecycle around a package-level mutable pointer becomes painful under test; the nil-client pattern is simpler and safer.
- **Attach telemetry as an MCP tool** — Rejected as in the Node design: telemetry is human-developer infrastructure, not a model capability. Adding a tool wastes context tokens on every call for every client.
- **Encryption at rest on the telemetry DB with a keychain-stored key** — Rejected for this iteration. Cross-platform keychain integration (macOS/Windows/Linux headless) is a bigger undertaking than hashing the sensitive fields. Revisit if the strict mode proves insufficient in practice.

## 2026-04-14: Use the official Go MCP SDK for transport

### Context
17 changes: 16 additions & 1 deletion IMPLEMENTATION.md
@@ -108,4 +108,19 @@ Ship a `workmem backup` subcommand that produces an age-encrypted snapshot of th
- [x] Wire `backup` subcommand in `cmd/workmem/main.go` with `--to`, `--age-recipient` (repeatable), `--db`, `--env-file`
- [x] README section documenting usage and manual `age -d` restore

**On Step Gate (all items [x]):** trigger correctness review focused on crypto wiring and VACUUM INTO error paths.

### Step 3.5: Telemetry [✅]

Port the Node telemetry design to Go with Go-native refinements and a new privacy-strict mode. **Gate:** when `MEMORY_TELEMETRY_PATH` is set, every tool call lands a row in `tool_calls`; every `recall` lands a row in `search_metrics` linked by `tool_call_id`; when unset, no DB is created and no overhead is added. In `MEMORY_TELEMETRY_PRIVACY=strict` mode, entity/query/label values are sha256-hashed before storage.

- [x] Build `internal/telemetry` package (nil-tolerant Client, schema, sanitize, hash, detect)
- [x] Refactor `SearchMemory` to return `(results, metrics, err)` — no globals, no side channels
- [x] Wire `*telemetry.Client` through `cmd/workmem/main.go` and `mcpserver.Config`
- [x] Wrap `mcpserver` dispatch with duration + args/result sanitization + LogToolCall/LogSearchMetrics
- [x] Unit tests for package (nil-client safety, init failure, strict hashing, sanitize, detect)
- [x] Integration tests: enabled roundtrip / disabled zero overhead / privacy-strict
- [x] `docs/TELEMETRY.md` adapted for Go with privacy-strict documented
- [x] Telemetry invariants wired into `OPERATIONS.md`

**On Step Gate (all items [x]):** trigger correctness review on telemetry hook points and strict-mode hashing.
16 changes: 4 additions & 12 deletions OPERATIONS.md
@@ -11,6 +11,9 @@
- Live-data queries must never bypass tombstone guards.
- FTS cleanup must never use raw `DELETE` against a contentless FTS table.
- `remember_event` must be atomic: the event row and all attached observations commit together or not at all. Proof: `TestRememberEventAtomicityOnMidLoopFailure` in `internal/store/parity_test.go`.
- Telemetry is opt-in (`MEMORY_TELEMETRY_PATH`) and never affects the tool call success path. Init failure logs a single warning to stderr and disables telemetry for the session; the main memory DB is unaffected.
- Telemetry data lives in its own SQLite file, physically separate from the memory database. No foreign keys, no joins, no shared lifecycle.
- When `MEMORY_TELEMETRY_PRIVACY=strict`, entity names, queries, and event labels must be sha256-hashed before reaching disk. Observation/content values are always reduced to `<N chars>` regardless of mode.

## Active Debt

@@ -32,12 +35,6 @@ Blast radius: Late failures on Linux or Windows packaging, or FTS behavior drift
Fix: Keep the canary in CI and run it on at least macOS, Linux, and Windows before calling the persistence layer portable.
Done when: the same schema/FTS canary passes in cross-build validation.

- Telemetry parity is consciously deferred until the new Go MCP entrypoint is wired into a real client and the request path is considered stable.
Trigger: Instrumenting before the client-facing transport contract has been debugged in practice.
Blast radius: Busywork telemetry code tied to temporary wiring.
Fix: keep telemetry scope documented; implement it once the Kilo-facing transport path is stable.
Done when: the Go MCP entrypoint is live under a real client and tool-call telemetry can be attached once, not retrofitted twice.

- FTS5 viability is proven locally on the chosen driver, but not yet in a cross-platform validation matrix.
Trigger: Assuming a passing local canary implies release-target portability.
Blast radius: Search or forget semantics break only after packaging or OS expansion.
@@ -54,15 +51,10 @@ Done when: FTS-specific parity tests pass across the release matrix.

### P2

- Telemetry schema and migration strategy are not yet designed.
Trigger: Reaching post-parity milestone without an observability plan.
Blast radius: Delayed adoption of telemetry in the Go port.
Fix: Define minimal telemetry compatibility after core parity lands.
Done when: telemetry design is recorded and scheduled.
- None active.

## Pre-Launch TODO

- Prove MCP stdio compatibility with Kilo or another real client.
- Prove schema initialization and migrations on clean and upgraded DBs.
- Prove forget semantics including FTS deletion.
- Prove project isolation.
11 changes: 10 additions & 1 deletion cmd/workmem/main.go
@@ -12,6 +12,7 @@ import (
"workmem/internal/dotenv"
"workmem/internal/mcpserver"
"workmem/internal/store"
"workmem/internal/telemetry"
)

func main() {
@@ -101,8 +102,16 @@ func runMCP(args []string) {

loadEnvFile(*envFile)

runtime, err := mcpserver.New(mcpserver.Config{DBPath: *dbPath})
// Ownership of the telemetry client transfers to the Runtime only after
// New returns successfully. If New fails, the DB was already opened by
// FromEnv and must be closed here — otherwise the handle leaks.
tele := telemetry.FromEnv()
runtime, err := mcpserver.New(mcpserver.Config{
DBPath: *dbPath,
Telemetry: tele,
})
if err != nil {
_ = tele.Close() // nil-safe no-op when telemetry is disabled
fmt.Fprintf(os.Stderr, "start mcp server: %v\n", err)
os.Exit(1)
}
190 changes: 190 additions & 0 deletions docs/TELEMETRY.md
@@ -0,0 +1,190 @@
# Telemetry

> Opt-in usage analytics for workmem. Zero overhead when disabled. Separate database, strict privacy controls.

## Enabling

Set `MEMORY_TELEMETRY_PATH` to a file path. When telemetry is enabled, the database is opened and the schema is initialized at process startup (before any tool call). If initialization fails, a single warning is printed to stderr and telemetry is disabled for the rest of the session — the main memory path is never affected.

**Via `.env`:**
```bash
MEMORY_TELEMETRY_PATH=./telemetry.db
```

**Via client config (example for Claude Code's `~/.claude.json`):**
```json
{
"memory": {
"command": "/path/to/workmem",
"args": ["-env-file", "/path/to/memory.env"],
"env": {
"MEMORY_TELEMETRY_PATH": "/absolute/path/to/telemetry.db"
}
}
}
```

When `MEMORY_TELEMETRY_PATH` is unset (the default), telemetry is disabled end-to-end: no database is opened, no schema is written, no rows are inserted. The dispatch wrapper skips its timing block entirely when the telemetry client is `nil`, so the only cost on the no-telemetry path is a single pointer-nil check per tool call.
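The dispatch wrapper's no-telemetry path can be sketched as follows; `wrap` and `handler` are hypothetical names, but the shape (one nil check up front, timing and logging only when the client exists) matches the behavior described above:

```go
package main

import (
	"fmt"
	"time"
)

// Client is nil when telemetry is disabled.
type Client struct{ calls []string }

func (c *Client) LogToolCall(tool string, durationMS float64) {
	if c == nil {
		return
	}
	c.calls = append(c.calls, tool)
}

type handler func() (string, error)

// wrap instruments a tool handler. With a nil client the only cost is
// the single nil check — no clock reads, no logging, no allocations.
func wrap(c *Client, tool string, h handler) handler {
	return func() (string, error) {
		if c == nil {
			return h() // telemetry disabled: zero-overhead path
		}
		start := time.Now()
		out, err := h()
		c.LogToolCall(tool, float64(time.Since(start).Microseconds())/1000.0)
		return out, err
	}
}

func main() {
	h := func() (string, error) { return "ok", nil }
	out, _ := wrap(nil, "recall", h)() // disabled path
	fmt.Println(out)
}
```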

## Privacy modes

Telemetry supports two modes, controlled by `MEMORY_TELEMETRY_PRIVACY`:

| Value | Mode | Behavior |
|-------|------|----------|
| (unset) / any other value | **permissive** (default) | entity names, queries, and event labels are stored in plaintext |
| `strict` | **strict** | entity names, queries, and event labels are sha256-hashed before storage |

Strict mode is intended for sensitive instances such as a `private_memory` server backing therapy/health/relationship content. Ranking debug ("which queries overfetch candidates?") becomes harder in strict mode because plaintext queries are no longer recoverable — but sensitive identifiers never land on disk.

Observation/content values are **always** reduced to `<N chars>` regardless of mode. Batched-array fields are **always** reduced to a count marker using the field name — `facts` becomes `<N facts>`, `observations` becomes `<N observations>`. Strict mode only changes what happens to identifier-like fields.
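A sketch of that always-on reduction, under the assumption that sanitization sees decoded JSON values (`sanitizeArg` is a hypothetical name; the real `internal/telemetry` sanitize code may differ):

```go
package main

import "fmt"

// sanitizeArg reduces content-bearing values to count markers in every
// mode: a string becomes "<N chars>", an array field becomes "<N name>"
// using the field's own name (so "facts" -> "<N facts>").
func sanitizeArg(field string, value interface{}) string {
	switch v := value.(type) {
	case string:
		return fmt.Sprintf("<%d chars>", len(v))
	case []interface{}:
		return fmt.Sprintf("<%d %s>", len(v), field)
	default:
		return fmt.Sprintf("%v", v) // scalars (bools, numbers) pass through
	}
}

func main() {
	fmt.Println(sanitizeArg("contents", "met with Sam about the Q3 plan"))
	fmt.Println(sanitizeArg("facts", []interface{}{"a", "b", "c"}))
}
```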

**Example `.env` for sensitive backend:**
```bash
MEMORY_TELEMETRY_PATH=/home/user/.local/state/workmem/private-telemetry.db
MEMORY_TELEMETRY_PRIVACY=strict
```

## What it logs

### Tool calls (`tool_calls` table)

Every MCP tool invocation is logged with:

| Column | Example |
|--------|---------|
| `ts` | `2026-04-14T20:15:32.456` |
| `tool` | `recall`, `remember`, `forget` |
| `client_name` | `kilo`, `claude-code`, `cursor`, `windsurf`, `vscode-copilot` |
| `client_version` | `0.43.6` |
| `client_source` | `protocol` / `env` / `none` |
| `db_scope` | `global` / `project` |
| `project_path` | resolved absolute path, or null |
| `duration_ms` | `12.4` |
| `args_summary` | Sanitized JSON (see below) |
| `result_summary` | Counts only, never data |
| `is_error` | `0` or `1` |

### Search ranking metrics (`search_metrics` table)

For `recall` calls, additional metrics capture the ranking pipeline:

| Column | Example |
|--------|---------|
| `tool_call_id` | FK into `tool_calls.id` |
| `query` | Search text (hashed in strict mode) |
| `channels` | `{"fts": 12, "fts_phrase": 3, "entity_exact": 1}` |
| `candidates_total` | `16` |
| `results_returned` | `5` |
| `limit_requested` | `20` |
| `score_min` | `0.32` |
| `score_max` | `0.87` |
| `score_median` | `0.61` |
| `compact` | `0` or `1` |

## What it does NOT log

- Observation content (replaced with `<N chars>`, always)
- Full result payloads — only counts (entities returned, observations stored, etc.)
- In strict mode, any identifier (entity name, query, event label, from/to)

## Client identity

The server identifies which client is calling through two mechanisms:

1. **MCP protocol** (primary) — the `initialize` handshake includes `clientInfo.name` and `clientInfo.version`. This is a required field in the MCP spec.
2. **Environment fingerprinting** (fallback) — when the protocol doesn't provide client info, the server detects the client from environment variables:

| Client | Signal |
|--------|--------|
| Kilo | `KILO=1` (version from `KILOCODE_VERSION`) |
| Claude Code | `CLAUDE_CODE_SSE_PORT` set |
| Cursor | `CURSOR_TRACE_ID` set |
| Windsurf | `WINDSURF_EXTENSION_ID` set |
| VS Code Copilot | `VSCODE_MCP_HTTP_PREFER` set non-empty |
| VS Code (unknown extension) | `TERM_PROGRAM=vscode` |

The `client_source` column tells you which mechanism fired: `protocol`, `env`, or `none`.

## Querying the data

The telemetry database is a standard SQLite file. Open it with any tool: `sqlite3`, DBeaver, Jupyter, pandas, etc.

### Example queries

**Tool usage by client:**
```sql
SELECT client_name, tool, COUNT(*) as calls,
ROUND(AVG(duration_ms), 1) as avg_ms
FROM tool_calls
GROUP BY client_name, tool
ORDER BY calls DESC;
```

**Search ranking quality (permissive mode — query is plaintext):**
```sql
SELECT query, candidates_total, results_returned,
ROUND(score_min, 3) as min, ROUND(score_max, 3) as max,
channels
FROM search_metrics
ORDER BY candidates_total DESC
LIMIT 20;
```

**Overfetch detection (candidates >> returned):**
```sql
SELECT query, candidates_total, results_returned, limit_requested,
ROUND(1.0 * results_returned / candidates_total, 2) as yield_ratio
FROM search_metrics
WHERE candidates_total > 0
ORDER BY yield_ratio ASC
LIMIT 20;
```

**Error rate by tool:**
```sql
SELECT tool, COUNT(*) as total,
SUM(is_error) as errors,
ROUND(100.0 * SUM(is_error) / COUNT(*), 1) as error_pct
FROM tool_calls
GROUP BY tool
ORDER BY error_pct DESC;
```

**Channel effectiveness:**
```sql
SELECT json_each.key as channel, COUNT(*) as appearances
FROM search_metrics, json_each(search_metrics.channels)
GROUP BY channel
ORDER BY appearances DESC;
```

## Separate database

Telemetry is stored in its own SQLite file, completely separate from `memory.db`. This means:

- Deleting the telemetry DB has zero impact on your knowledge graph
- The telemetry DB can be wiped and recreated at any time
- No foreign keys or joins between telemetry and memory data
- Telemetry uses `journal_mode=WAL` for concurrent reads while the server writes

## Init failure handling

If the telemetry path is invalid or the database can't be opened, the server prints a single warning to stderr and disables telemetry for the rest of the session. It does not retry on every call. The main `memory.db` is unaffected — telemetry failure never breaks the tool call path.

Example warning:
```
[memory] telemetry init failed (disabled for this session): unable to open database file
```

## Design rationale

**Why plaintext queries in permissive mode?** Local, single-user development: the telemetry DB lives on your machine, is gitignored, and is only readable by you. Redacting queries permissively would make the analytics useless — you can't answer "which queries produce too many candidates?" if the query text is hashed.

**Why privacy-strict mode?** For backends holding sensitive content (therapy, health, relationships, personal journaling), even local plaintext can matter: laptop loss, accidental sync-folder placement, or exported snapshots. Strict mode ensures entity names and queries never land on disk in the clear.

**Why a separate SQLite and not the memory DB?** Separation of concerns. The memory DB is your knowledge graph. Telemetry is operational data with a different lifecycle (wipe freely, aggregate, export). Mixing them risks accidentally leaking telemetry via `recall`, or losing telemetry history when the memory DB is rebuilt.

**Why not an MCP tool?** Telemetry is infrastructure, not a capability the model needs. Adding a tool would cost context tokens on every call for something only the human developer uses. Query the SQLite directly.

**Why a nil-tolerant `*Client`?** The alternative is `if TELEMETRY_ENABLED` checks sprinkled across every call site. The `nil`-receiver pattern keeps the dispatch code clean — the wrapper always calls `LogToolCall`; when telemetry is disabled, the client is `nil` and the method returns immediately.