29 changes: 29 additions & 0 deletions DECISION_LOG.md
@@ -26,6 +26,35 @@ Add a `workmem backup` subcommand that writes an age-encrypted snapshot of the g
- **Include project-scoped DBs automatically.** Rejected: project DBs belong to workspaces, not to the user's top-level knowledge. Auto-including them couples backup to filesystem scanning and makes the unit of restore ambiguous. A `backup` invocation per workspace is explicit.
- **Include telemetry.db in the snapshot.** Rejected: telemetry is operational, rebuildable, and has a different lifecycle than knowledge. Mixing them also risks leaking telemetry via recall if paths cross.

## 2026-04-14: Port telemetry with Go-native refinements and add privacy-strict mode

### Context

The Node reference implementation shipped with opt-in telemetry in a separate SQLite DB (schemas: `tool_calls`, `search_metrics`). Phase 3 deferred the port until the Go MCP entrypoint was real and adopted. That condition is now met: the Go binary serves Claude Code, Kilo, and Codex in production. Time to port.

### Decision

Port the Node telemetry design to Go and preserve the guiding principles (opt-in via env, separate database, counts-only for results, content replaced with `<N chars>`). Refine three Node-era shortcuts:

1. **Nil-tolerant `*Client`** — the client value is `nil` when disabled, every method returns immediately on `nil` receiver. Replaces per-callsite `if TELEMETRY_ENABLED` checks.
2. **No globals** — the client is constructed in `cmd/workmem/main.go` and plumbed via `mcpserver.Config{Telemetry: …}`. Replaces the Node pattern of module-level mutable state (`_telemetryDb`, `_lastSearchMetrics`, etc.).
3. **`SearchMemory` returns `SearchMetrics` as a tuple** — `(results []SearchObservation, metrics SearchMetrics, err error)`. Replaces the Node `_lastSearchMetrics` side-channel.
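A minimal sketch of patterns 1 and 3, with hypothetical simplified types (`Client`, `SearchMetrics`, and `searchMemory` stand in for the real `internal/telemetry` and store code, which may differ):

```go
package main

import "fmt"

// Client is nil when telemetry is disabled; every method returns
// immediately on a nil receiver, so call sites never branch on an
// enabled flag.
type Client struct{ rows int }

func (c *Client) LogToolCall(tool string) {
	if c == nil {
		return // disabled: no-op
	}
	c.rows++
}

// SearchMetrics is returned by value alongside the results — no
// module-level side channel like Node's _lastSearchMetrics.
type SearchMetrics struct {
	CandidatesTotal int
	ResultsReturned int
}

func searchMemory(query string) ([]string, SearchMetrics, error) {
	results := []string{"obs-1", "obs-2"} // placeholder results
	return results, SearchMetrics{CandidatesTotal: 5, ResultsReturned: len(results)}, nil
}

func main() {
	var disabled *Client           // nil: telemetry off
	disabled.LogToolCall("recall") // safe no-op on nil receiver

	results, metrics, err := searchMemory("alpha")
	fmt.Println(len(results), metrics.CandidatesTotal, err)
}
```

The nil-receiver call is legal Go: methods on pointer receivers dispatch statically, so `disabled.LogToolCall` runs the method body with `c == nil` rather than panicking.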

Add a new **privacy-strict mode** (`MEMORY_TELEMETRY_PRIVACY=strict`): entity names, queries, and event labels are sha256-hashed before storage. Intended for sensitive backends (e.g., the `private_memory` server backing therapy/health/relationship content).
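The strict-mode transform can be sketched as a single helper; `hashIfStrict` is a hypothetical name, but the behavior (sha256-hash identifier-like fields before storage, pass through in permissive mode) follows the decision above:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashIfStrict replaces identifier-like fields (entity names, queries,
// event labels) with their sha256 hex digest in strict mode, and passes
// them through unchanged in permissive mode.
func hashIfStrict(value string, strict bool) string {
	if !strict {
		return value
	}
	sum := sha256.Sum256([]byte(value))
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(hashIfStrict("therapist-notes", false)) // permissive: plaintext
	fmt.Println(hashIfStrict("therapist-notes", true))  // strict: 64-char hex digest
}
```

Hashing is one-way but deterministic, so aggregate queries ("how often does the same entity recur?") still work in strict mode even though the plaintext is unrecoverable.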

### Rationale

- Node-era globals would have been awkward in Go and hard to test under parallel `t.Run` — eliminating them keeps the test story clean.
- Privacy-strict closes a real threat: local plaintext telemetry DB on a laptop with sensitive entity names is a leak vector if the laptop is lost/sync'd/exported. Strict mode lets one binary serve two wiring contexts (permissive `memory`, strict `private_memory`) cleanly.
- `SearchMemory` returning metrics as a proper value is idiomatic Go and testable in isolation without the telemetry package.
- Using `modernc.org/sqlite` for the telemetry DB keeps the pure-Go single-binary invariant (no CGO addition just for the analytics path).

### Alternatives considered

- **1:1 port with globals** — Rejected because Go's `database/sql` + `sql.Stmt` lifecycle around a package-level mutable pointer becomes painful under test; the nil-client pattern is simpler and safer.
- **Attach telemetry as an MCP tool** — Rejected as in the Node design: telemetry is human-developer infrastructure, not a model capability. Adding a tool wastes context tokens on every call for every client.
- **Encryption at rest on the telemetry DB with a keychain-stored key** — Rejected for this iteration. Cross-platform keychain integration (macOS/Windows/Linux headless) is a bigger undertaking than hashing the sensitive fields. Revisit if the strict mode proves insufficient in practice.

## 2026-04-14: Use the official Go MCP SDK for transport

### Context
17 changes: 16 additions & 1 deletion IMPLEMENTATION.md
@@ -108,4 +108,19 @@ Ship a `workmem backup` subcommand that produces an age-encrypted snapshot of th
- [x] Wire `backup` subcommand in `cmd/workmem/main.go` with `--to`, `--age-recipient` (repeatable), `--db`, `--env-file`
- [x] README section documenting usage and manual `age -d` restore

**On Step Gate (all items [x]):** trigger correctness review focused on crypto wiring and VACUUM INTO error paths.

### Step 3.5: Telemetry [✅]

Port the Node telemetry design to Go with Go-native refinements and a new privacy-strict mode. **Gate:** when `MEMORY_TELEMETRY_PATH` is set, every tool call lands a row in `tool_calls`; every `recall` lands a row in `search_metrics` linked by `tool_call_id`; when unset, no DB is created and no overhead is added. In `MEMORY_TELEMETRY_PRIVACY=strict` mode, entity/query/label values are sha256-hashed before storage.

- [x] Build `internal/telemetry` package (nil-tolerant Client, schema, sanitize, hash, detect)
- [x] Refactor `SearchMemory` to return `(results, metrics, err)` — no globals, no side channels
- [x] Wire `*telemetry.Client` through `cmd/workmem/main.go` and `mcpserver.Config`
- [x] Wrap `mcpserver` dispatch with duration + args/result sanitization + LogToolCall/LogSearchMetrics
- [x] Unit tests for package (nil-client safety, init failure, strict hashing, sanitize, detect)
- [x] Integration tests: enabled roundtrip / disabled zero overhead / privacy-strict
- [x] `docs/TELEMETRY.md` adapted for Go with privacy-strict documented
- [x] Telemetry invariants wired into `OPERATIONS.md`

**On Step Gate (all items [x]):** trigger correctness review on telemetry hook points and strict-mode hashing.
16 changes: 4 additions & 12 deletions OPERATIONS.md
@@ -11,6 +11,9 @@
- Live-data queries must never bypass tombstone guards.
- FTS cleanup must never use raw `DELETE` against a contentless FTS table.
- `remember_event` must be atomic: the event row and all attached observations commit together or not at all. Proof: `TestRememberEventAtomicityOnMidLoopFailure` in `internal/store/parity_test.go`.
- Telemetry is opt-in (`MEMORY_TELEMETRY_PATH`) and never affects the tool call success path. Init failure logs a single warning to stderr and disables telemetry for the session; the main memory DB is unaffected.
- Telemetry data lives in its own SQLite file, physically separate from the memory database. No foreign keys, no joins, no shared lifecycle.
- When `MEMORY_TELEMETRY_PRIVACY=strict`, entity names, queries, and event labels must be sha256-hashed before reaching disk. Observation/content values are always reduced to `<N chars>` regardless of mode.

## Active Debt

@@ -32,12 +35,6 @@ Blast radius: Late failures on Linux or Windows packaging, or FTS behavior drift
Fix: Keep the canary in CI and run it on at least macOS, Linux, and Windows before calling the persistence layer portable.
Done when: the same schema/FTS canary passes in cross-build validation.

- Telemetry parity is consciously deferred until the new Go MCP entrypoint is wired into a real client and the request path is considered stable.
Trigger: Instrumenting before the client-facing transport contract has been debugged in practice.
Blast radius: Busywork telemetry code tied to temporary wiring.
Fix: keep telemetry scope documented; implement it once the Kilo-facing transport path is stable.
Done when: the Go MCP entrypoint is live under a real client and tool-call telemetry can be attached once, not retrofitted twice.

- FTS5 viability is proven locally on the chosen driver, but not yet in a cross-platform validation matrix.
Trigger: Assuming a passing local canary implies release-target portability.
Blast radius: Search or forget semantics break only after packaging or OS expansion.
@@ -54,15 +51,10 @@ Done when: FTS-specific parity tests pass across the release matrix.

### P2

- Telemetry schema and migration strategy are not yet designed.
Trigger: Reaching post-parity milestone without an observability plan.
Blast radius: Delayed adoption of telemetry in the Go port.
Fix: Define minimal telemetry compatibility after core parity lands.
Done when: telemetry design is recorded and scheduled.
- None active.

## Pre-Launch TODO

- Prove MCP stdio compatibility with Kilo or another real client.
- Prove schema initialization and migrations on clean and upgraded DBs.
- Prove forget semantics including FTS deletion.
- Prove project isolation.
11 changes: 10 additions & 1 deletion cmd/workmem/main.go
@@ -12,6 +12,7 @@ import (
"workmem/internal/dotenv"
"workmem/internal/mcpserver"
"workmem/internal/store"
"workmem/internal/telemetry"
)

func main() {
@@ -101,8 +102,16 @@ func runMCP(args []string) {

loadEnvFile(*envFile)

runtime, err := mcpserver.New(mcpserver.Config{DBPath: *dbPath})
// Ownership of the telemetry client transfers to the Runtime only after
// New returns successfully. If New fails, the DB was already opened by
// FromEnv and must be closed here — otherwise the handle leaks.
tele := telemetry.FromEnv()
runtime, err := mcpserver.New(mcpserver.Config{
DBPath: *dbPath,
Telemetry: tele,
})
if err != nil {
_ = tele.Close() // nil-safe no-op when telemetry is disabled
fmt.Fprintf(os.Stderr, "start mcp server: %v\n", err)
os.Exit(1)
}
190 changes: 190 additions & 0 deletions docs/TELEMETRY.md
@@ -0,0 +1,190 @@
# Telemetry

> Opt-in usage analytics for workmem. Zero overhead when disabled. Separate database, strict privacy controls.

## Enabling

Set `MEMORY_TELEMETRY_PATH` to a file path. When telemetry is enabled, the database is opened and the schema is initialized at process startup (before any tool call). If initialization fails, a single warning is printed to stderr and telemetry is disabled for the rest of the session — the main memory path is never affected.

**Via `.env`:**
```bash
MEMORY_TELEMETRY_PATH=./telemetry.db
```

**Via client config (example for Claude Code's `~/.claude.json`):**
```json
{
"memory": {
"command": "/path/to/workmem",
"args": ["-env-file", "/path/to/memory.env"],
"env": {
"MEMORY_TELEMETRY_PATH": "/absolute/path/to/telemetry.db"
}
}
}
```

When `MEMORY_TELEMETRY_PATH` is unset (the default), telemetry is disabled end-to-end: no database is opened, no schema is written, no rows are inserted. The dispatch wrapper skips its timing block entirely when the telemetry client is `nil`, so the only cost on the no-telemetry path is a single pointer-nil check per tool call.
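The dispatch wrapper's no-telemetry path can be sketched as follows; `wrap` and `handler` are hypothetical names, but the shape (one nil check up front, timing and logging only when the client exists) matches the behavior described above:

```go
package main

import (
	"fmt"
	"time"
)

// Client is nil when telemetry is disabled.
type Client struct{ calls []string }

func (c *Client) LogToolCall(tool string, durationMS float64) {
	if c == nil {
		return
	}
	c.calls = append(c.calls, tool)
}

type handler func() (string, error)

// wrap instruments a tool handler. With a nil client the only cost is
// the single nil check — no clock reads, no logging, no allocations.
func wrap(c *Client, tool string, h handler) handler {
	return func() (string, error) {
		if c == nil {
			return h() // telemetry disabled: zero-overhead path
		}
		start := time.Now()
		out, err := h()
		c.LogToolCall(tool, float64(time.Since(start).Microseconds())/1000.0)
		return out, err
	}
}

func main() {
	h := func() (string, error) { return "ok", nil }
	out, _ := wrap(nil, "recall", h)() // disabled path
	fmt.Println(out)
}
```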

## Privacy modes

Telemetry supports two modes, controlled by `MEMORY_TELEMETRY_PRIVACY`:

| Value | Mode | Behavior |
|-------|------|----------|
| (unset) / any other value | **permissive** (default) | entity names, queries, and event labels are stored in plaintext |
| `strict` | **strict** | entity names, queries, and event labels are sha256-hashed before storage |

Strict mode is intended for sensitive instances such as a `private_memory` server backing therapy/health/relationship content. Ranking debug ("which queries overfetch candidates?") becomes harder in strict mode because plaintext queries are no longer recoverable — but sensitive identifiers never land on disk.

Observation/content values are **always** reduced to `<N chars>` regardless of mode. Batched-array fields are **always** reduced to a count marker using the field name — `facts` becomes `<N facts>`, `observations` becomes `<N observations>`. Strict mode only changes what happens to identifier-like fields.
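A sketch of that always-on reduction, under the assumption that sanitization sees decoded JSON values (`sanitizeArg` is a hypothetical name; the real `internal/telemetry` sanitize code may differ):

```go
package main

import "fmt"

// sanitizeArg reduces content-bearing values to count markers in every
// mode: a string becomes "<N chars>", an array field becomes "<N name>"
// using the field's own name (so "facts" -> "<N facts>").
func sanitizeArg(field string, value interface{}) string {
	switch v := value.(type) {
	case string:
		return fmt.Sprintf("<%d chars>", len(v))
	case []interface{}:
		return fmt.Sprintf("<%d %s>", len(v), field)
	default:
		return fmt.Sprintf("%v", v) // scalars (bools, numbers) pass through
	}
}

func main() {
	fmt.Println(sanitizeArg("contents", "met with Sam about the Q3 plan"))
	fmt.Println(sanitizeArg("facts", []interface{}{"a", "b", "c"}))
}
```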

**Example `.env` for sensitive backend:**
```bash
MEMORY_TELEMETRY_PATH=/home/user/.local/state/workmem/private-telemetry.db
MEMORY_TELEMETRY_PRIVACY=strict
```

## What it logs

### Tool calls (`tool_calls` table)

Every MCP tool invocation is logged with:

| Column | Example |
|--------|---------|
| `ts` | `2026-04-14T20:15:32.456` |
| `tool` | `recall`, `remember`, `forget` |
| `client_name` | `kilo`, `claude-code`, `cursor`, `windsurf`, `vscode-copilot` |
| `client_version` | `0.43.6` |
| `client_source` | `protocol` / `env` / `none` |
| `db_scope` | `global` / `project` |
| `project_path` | resolved absolute path, or null |
| `duration_ms` | `12.4` |
| `args_summary` | Sanitized JSON (see below) |
| `result_summary` | Counts only, never data |
| `is_error` | `0` or `1` |

### Search ranking metrics (`search_metrics` table)

For `recall` calls, additional metrics capture the ranking pipeline:

| Column | Example |
|--------|---------|
| `tool_call_id` | FK into `tool_calls.id` |
| `query` | Search text (hashed in strict mode) |
| `channels` | `{"fts": 12, "fts_phrase": 3, "entity_exact": 1}` |
| `candidates_total` | `16` |
| `results_returned` | `5` |
| `limit_requested` | `20` |
| `score_min` | `0.32` |
| `score_max` | `0.87` |
| `score_median` | `0.61` |
| `compact` | `0` or `1` |

## What it does NOT log

- Observation content (replaced with `<N chars>`, always)
- Full result payloads — only counts (entities returned, observations stored, etc.)
- In strict mode, any identifier (entity name, query, event label, from/to)

## Client identity

The server identifies which client is calling through two mechanisms:

1. **MCP protocol** (primary) — the `initialize` handshake includes `clientInfo.name` and `clientInfo.version`. This is a required field in the MCP spec.
2. **Environment fingerprinting** (fallback) — when the protocol doesn't provide client info, the server detects the client from environment variables:

| Client | Signal |
|--------|--------|
| Kilo | `KILO=1` (version from `KILOCODE_VERSION`) |
| Claude Code | `CLAUDE_CODE_SSE_PORT` set |
| Cursor | `CURSOR_TRACE_ID` set |
| Windsurf | `WINDSURF_EXTENSION_ID` set |
| VS Code Copilot | `VSCODE_MCP_HTTP_PREFER` set non-empty |
| VS Code (unknown extension) | `TERM_PROGRAM=vscode` |

The `client_source` column tells you which mechanism fired: `protocol`, `env`, or `none`.

## Querying the data

The telemetry database is a standard SQLite file. Open it with any tool: `sqlite3`, DBeaver, Jupyter, pandas, etc.

### Example queries

**Tool usage by client:**
```sql
SELECT client_name, tool, COUNT(*) as calls,
ROUND(AVG(duration_ms), 1) as avg_ms
FROM tool_calls
GROUP BY client_name, tool
ORDER BY calls DESC;
```

**Search ranking quality (permissive mode — query is plaintext):**
```sql
SELECT query, candidates_total, results_returned,
ROUND(score_min, 3) as min, ROUND(score_max, 3) as max,
channels
FROM search_metrics
ORDER BY candidates_total DESC
LIMIT 20;
```

**Overfetch detection (candidates >> returned):**
```sql
SELECT query, candidates_total, results_returned, limit_requested,
ROUND(1.0 * results_returned / candidates_total, 2) as yield_ratio
FROM search_metrics
WHERE candidates_total > 0
ORDER BY yield_ratio ASC
LIMIT 20;
```

**Error rate by tool:**
```sql
SELECT tool, COUNT(*) as total,
SUM(is_error) as errors,
ROUND(100.0 * SUM(is_error) / COUNT(*), 1) as error_pct
FROM tool_calls
GROUP BY tool
ORDER BY error_pct DESC;
```

**Channel effectiveness:**
```sql
SELECT json_each.key as channel, COUNT(*) as appearances
FROM search_metrics, json_each(search_metrics.channels)
GROUP BY channel
ORDER BY appearances DESC;
```

## Separate database

Telemetry is stored in its own SQLite file, completely separate from `memory.db`. This means:

- Deleting the telemetry DB has zero impact on your knowledge graph
- The telemetry DB can be wiped and recreated at any time
- No foreign keys or joins between telemetry and memory data
- Telemetry uses `journal_mode=WAL` for concurrent reads while the server writes

## Init failure handling

If the telemetry path is invalid or the database can't be opened, the server prints a single warning to stderr and disables telemetry for the rest of the session. It does not retry on every call. The main `memory.db` is unaffected — telemetry failure never breaks the tool call path.

Example warning:
```
[memory] telemetry init failed (disabled for this session): unable to open database file
```

## Design rationale

**Why plaintext queries in permissive mode?** Local, single-user development: the telemetry DB lives on your machine, is gitignored, and is only readable by you. Redacting queries permissively would make the analytics useless — you can't answer "which queries produce too many candidates?" if the query text is hashed.

**Why privacy-strict mode?** For backends holding sensitive content (therapy, health, relationships, personal journaling), even local plaintext can matter: laptop loss, accidental sync-folder placement, or exported snapshots. Strict mode ensures entity names and queries never land on disk in the clear.

**Why a separate SQLite and not the memory DB?** Separation of concerns. The memory DB is your knowledge graph. Telemetry is operational data with a different lifecycle (wipe freely, aggregate, export). Mixing them risks accidentally leaking telemetry via `recall`, or losing telemetry history when the memory DB is rebuilt.

**Why not an MCP tool?** Telemetry is infrastructure, not a capability the model needs. Adding a tool would cost context tokens on every call for something only the human developer uses. Query the SQLite directly.

**Why a nil-tolerant `*Client`?** The alternative is `if TELEMETRY_ENABLED` checks sprinkled across every call site. The `nil`-receiver pattern keeps the dispatch code clean — the wrapper always calls `LogToolCall`; when telemetry is disabled, the client is `nil` and the method returns immediately.