feat: encrypted backup subcommand (age + VACUUM INTO)#6
Conversation
Produce end-to-end encrypted snapshots of the memory database without leaving CGO-free territory. The package has two exported symbols: - Run(ctx, sourceDB, destPath, recipients): VACUUM INTO into a temp file for a consistent SQLite snapshot, then stream through age.Encrypt to destPath (written 0600). The plaintext intermediate lives only in os.MkdirTemp and is always removed (defer + cleanup on both success and error paths). - ParseRecipients(inputs): accepts either raw age1... keys or paths to recipients files (one key per line, # comments), delegating to age.ParseX25519Recipient / age.ParseRecipients. Dependency: filippo.io/age v1.3.1 — pure Go, preserves the CGO_ENABLED=0 single-binary invariant. No CGO-based SQLCipher, no cross-platform keychain integration (see DECISION_LOG). Tests: round-trip (backup + age.Decrypt + reopen as SQLite, assert rows match), missing source, zero recipients, unwritable destination, empty paths, ParseRecipients happy + sad paths.
The new workmem backup subcommand:
workmem backup --to <path.age> --age-recipient <key-or-file> [...]
Flags:
--to destination file (required)
--age-recipient age recipient, age1... literal or path to recipients
file; repeatable, at least one required
--db source DB (optional, same resolver as serve)
--env-file .env loader, same semantics as serve
The subcommand reuses the existing DB-path resolver from mcpserver, so
the export rename resolveDBPath -> ResolveDBPath was needed. printUsage
now documents the command, its flags, and the one-liner for manual
restoration with the age CLI.
Only the global memory DB is backed up. Project-scoped DBs live in
their own workspaces and must be backed up separately with a second
invocation using --db.
- README.md: new "Backup" section between Database and Design principles. One paragraph explaining the VACUUM INTO + age approach, two CLI examples (single recipient / multi-recipient), the manual restore one-liner, and the scope statement (global DB only, telemetry excluded). - IMPLEMENTATION.md: new Step 3.4 "Encrypted backup command" with explicit Gate — round-trip test proves the encrypted snapshot decrypts back to a readable SQLite DB. Items all checked. - DECISION_LOG.md: append dated entry explaining why VACUUM INTO + pure-Go age instead of encryption-at-rest with SQLCipher + keychain, why asymmetric-only (no passphrase), why restore is left as plain age -d, and why project DBs / telemetry DB are deliberately excluded from backup scope.
There was a problem hiding this comment.
Pull request overview
Adds an encrypted backup workflow to workmem by introducing a backup CLI subcommand that creates a consistent SQLite snapshot via VACUUM INTO and streams it through age encryption, plus documentation and tests.
Changes:
- Add
internal/backuppackage implementingVACUUM INTOsnapshot +filippo.io/ageencryption, and recipient parsing. - Wire
workmem backupsubcommand and exposemcpserver.ResolveDBPathfor shared DB-path resolution. - Document backup/restore flow and record the design decision.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/mcpserver/server.go | Exports ResolveDBPath and updates runtime initialization to use it. |
| cmd/workmem/main.go | Adds backup subcommand, flags, recipient collection, and usage text. |
| internal/backup/backup.go | Implements snapshot + age encryption and recipient parsing. |
| internal/backup/backup_test.go | Adds unit tests for round-trip decrypt and error paths. |
| go.mod | Adds age-related dependencies. |
| go.sum | Adds checksums for new dependencies. |
| README.md | Documents workmem backup usage and manual restore. |
| IMPLEMENTATION.md | Marks “Encrypted backup command” step complete and describes gate. |
| DECISION_LOG.md | Records rationale and alternatives for encrypted backup approach. |
Four comments from the Copilot auto-reviewer, all verified against the code. Each addressed here: 1. backup.Run rejected when destPath resolves to sourceDB. A distracted "--to memory.db" used to overwrite and corrupt the live DB since encryptToFile truncates destPath. New guard rejectDestEqualsSource checks (a) cleaned absolute path equality and (b) os.SameFile when destPath already exists, so hard links and pre-existing symlinks to the same inode also fail loudly. The exotic "dest is a symlink to source but does not yet exist" case is left uncovered on purpose — VACUUM INTO would fail on the symlinked target anyway. 2. encryptToFile now Chmod(0o600) on the destination file right after OpenFile. OpenFile's mode argument only takes effect when the file is newly created; a pre-existing file was being truncated but kept its original (possibly looser) mode before sensitive ciphertext was written to it. On Windows Chmod has limited semantics but the call is still safe. 3. Test helper newIdentity(t) replaces six `age.GenerateX25519Identity()` sites that were discarding the error return. If rand were ever broken the old tests would have panicked on identity.Recipient() with no context; now they fail loudly via t.Fatalf. 4. go mod tidy so filippo.io/age is listed as a direct dependency (internal/backup imports it directly). Previously it was tagged `// indirect`. New tests: TestRunRejectsDestEqualToSource (direct path equality) and TestRunRejectsDestResolvingToSameFile (hard link). The hard-link test self-skips when the filesystem does not support os.Link.
Three new comments on the backup command after the first round. All legitimate: 1. encryptToFile now uses atomic write-then-rename: the ciphertext is written into a sibling ".tmp-*" file in the destination directory, fsynced, closed, and os.Rename'd onto destPath only on success. This guarantees destPath either holds the previous valid backup or the new one — never a truncated halfway state. A crash, Ctrl+C, or any error mid-encryption leaves the prior backup untouched, where the old O_TRUNC + os.Remove pattern would have destroyed it. The rename is placed in the same directory as destPath so the rename is guaranteed same-filesystem (cross-device renames are not atomic). On cleanup failure the defer removes the temp file unless the commit flag was set. 2. vacuumSnapshot now matches the main store's modernc.org/sqlite conventions: db.SetMaxOpenConns(1) and a PingContext before VACUUM INTO. Brings up open failures immediately instead of surfacing them partway through the statement. 3. runBackup now uses signal.NotifyContext(ctx, os.Interrupt, SIGTERM) instead of context.Background(), so Ctrl+C during a long VACUUM or encryption is propagated through sql.ExecContext and interrupts cleanly. New tests: - TestRunLeavesNoTempFileAfterSuccess: asserts the atomic-write cleanup doesn't leak ".tmp-*" files on the happy path. - TestRunAtomicWriteReplacesExistingBackupOnSecondRun: two sequential Runs with different identities; the final dest must decrypt with the new identity and NOT with the old one, proving rename fully replaced the prior content.
Four new comments, all legitimate. Three address real behavior gaps, one hardens a test. 1. Context propagation to encryption (backup.go): Run used to thread ctx into vacuumSnapshot and then drop it on the floor for encryptToFile, so Ctrl+C / SIGTERM would cancel the SQL VACUUM but let the encryption run to completion — surprising for a CLI that looks responsive because signal.NotifyContext is wired in main. encryptToFile now accepts ctx, checks ctx.Err() before starting, and wraps the source reader in a ctxReader so io.Copy aborts on the next Read when the context is cancelled. Temp file cleanup was already covered by the existing defer. 2. Directory fsync after rename (backup.go): the docstring claimed "fsynced + atomically renamed" durability, but the implementation only fsynced the temp file and the rename itself. On POSIX the directory entry is not durable until the containing directory is fsynced. Now calls d.Sync() + d.Close() on the destination directory right after os.Rename. Best-effort on Windows where directory Sync returns an error — the NTFS rename is durable by the time MoveFileEx/ReplaceFile returns anyway. 3. ParseRecipients disambiguation (backup.go): the old code treated every input starting with "age1" as an X25519 literal and never tried to open it as a file. That made "./age1-recipients.txt" impossible to use and produced a confusing "parse recipient" error. Now checks os.Stat first: if the input exists as a file, it is parsed as a recipients file regardless of prefix; otherwise the "age1" prefix path is tried; otherwise a clear error explains that the input is neither a file nor a literal. 4. rows.Err() check (backup_test.go): the round-trip test iterated rows.Next() without calling rows.Err() afterward. If the driver surfaced an iteration error mid-scan, the test would silently pass. Added the standard post-loop Err() assertion. New tests: - TestRunRespectsCancelledContext: pre-cancels ctx, expects Run to error and leave no dest file behind. - TestParseRecipientsPrefersFileOverAge1Prefix: writes a file named "age1-recipients.txt", proves ParseRecipients picks the file branch over the literal-recipient branch. - TestParseRecipientsRejectsNeitherFileNorLiteral: input that is neither a path nor an age1 literal fails loudly with a clear error.
Round 3 Copilot flagged that any os.Stat error falls through to literal parsing, so a misconfigured recipients file with restrictive permissions would surface as the misleading "neither an existing file nor an age1 literal" message instead of the real cause. Changes: - Distinguish fs.ErrNotExist (legitimate fall-through to age1 literal parse) from every other stat error (permission denied, I/O error, etc.) which now surfaces verbatim via "stat recipient path %q: %w". - Reject directory paths explicitly with "is a directory, not a file" so os.Open's POSIX success on a directory no longer leaks into a confusing age.ParseRecipients scan error. Tests: - TestParseRecipientsRejectsDirectory (cross-platform) proves the IsDir guard. - TestParseRecipientsSurfacesNonNotExistStatError (POSIX-gated via //go:build !windows, skipped as root) chmods a parent directory to 0o000 to produce a real EACCES stat and asserts the error surfaces with the correct prefix instead of falling through.
| os.Exit(1) | ||
| } | ||
|
|
||
| parsed, err := backup.ParseRecipients(recipients) |
There was a problem hiding this comment.
backup.ParseRecipients takes a []string, but recipients here is a named slice type (recipientFlag). As written, backup.ParseRecipients(recipients) won’t compile; convert it explicitly (e.g., to the underlying []string) or change ParseRecipients to accept the named type/interface.
| parsed, err := backup.ParseRecipients(recipients) | |
| parsed, err := backup.ParseRecipients([]string(recipients)) |
There was a problem hiding this comment.
This does compile — Go's assignability rules allow passing a named slice type where the parameter type is the underlying unnamed slice type. From the spec: "A value x of type V is assignable to a variable of type T if V and T have identical underlying types and at least one of V or T is not a named type." Here V is recipientFlag (named, underlying []string) and T is []string (unnamed), so the call is valid without an explicit conversion. Empirical proof: Test (ubuntu-latest / macos-latest / windows-latest) and all five cross-build jobs on this branch's HEAD are green, and go build ./... succeeds locally. I'd rather keep the call site natural than add a cosmetic conversion.
Summary
Adds
workmem backup— a subcommand that produces an age-encrypted, consistent snapshot of the global memory database. Addresses the cloud-backup gap identified during the telemetry discussion: iCloud Drive / Google Drive / Dropbox can't hold a live SQLite DB (WAL race with sync client), and OS-level disk crypto (FileVault/BitLocker/LUKS) only covers the laptop-spento threat model.Restore is deliberately not wrapped by the CLI — it's a plain age one-liner:
age -d -i my-identity.txt backup.age > memory.dbTechnical choices (full rationale in DECISION_LOG.md)
VACUUM INTOfor consistent snapshot — driver-agnostic, no lock on the live DB, no dependence on modernc internals.filippo.io/agev1.3.1 — pure Go, preservesCGO_ENABLED=0. No SQLCipher, no cross-platform keychain integration.age -pthe output themselves.--dbinvocation. Telemetry is operational and rebuildable, deliberately excluded.Commits
feat: add internal/backup package with filippo.io/age— package + 9 unit testsfeat: backup subcommand; export mcpserver.ResolveDBPath— wires the CLI, reuses the existing DB-path resolverdocs: backup section, Step 3.4, decision entry— README + IMPLEMENTATION + DECISION_LOGTest plan
age.Decrypt→ reopen SQLite → assert rows match), missing source, zero recipients, unwritable destination, empty paths,ParseRecipientsliteral + file + invalid + empty input.go test ./...green on darwin/arm64.workmem backup --helpsurfaces all flags and the manual-restore hint.Compatibility with PR #5 (telemetry)
This PR is independent of
feat/telemetry— branched frommain, no imports crossing over. When both merge,IMPLEMENTATION.mdwill conflict on the Step 3.x numbering (both claim 3.4). Trivial to resolve: the second-to-land PR renumbers its step to 3.5.Follow-ups
workmem restoresubcommand — not doing it (see DECISION_LOG). If enough users surface the rough edge, revisit.