471 changes: 9 additions & 462 deletions package-lock.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions package.json
@@ -18,6 +18,7 @@
"url": "https://github.com/memrok-com/memrok.git"
},
"devDependencies": {
"@types/node": "^24.0.0",
"tsup": "^8.5.1",
"vitest": "^4.1.2"
}
46 changes: 37 additions & 9 deletions packages/daemon/src/watcher.ts
@@ -1,5 +1,6 @@
import { watch, type FSWatcher } from 'chokidar';
import { readFileSync, writeFileSync, statSync, openSync, readSync, closeSync } from 'node:fs';
import { homedir } from 'node:os';
import { resolve } from 'node:path';
import { EventEmitter } from 'node:events';
import type { WatcherConfig, CursorState } from './types.js';
@@ -22,7 +23,10 @@ export class TranscriptWatcher extends EventEmitter {
super();
this.config = config;
this.debounceMs = config.debounceMs ?? DEFAULT_DEBOUNCE_MS;
this.cursorPath = cursorPath ?? resolve(process.cwd(), '.memrok-cursors.json');
const baseDir = process.env.OPENCLAW_DATA_DIR
|| process.env.OPENCLAW_STATE_DIR
|| resolve(homedir(), '.openclaw');
this.cursorPath = cursorPath ?? resolve(baseDir, '.memrok-cursors.json');
this.loadCursors();
}

@@ -43,29 +47,53 @@ export class TranscriptWatcher extends EventEmitter {
return { ...this.cursors };
}

readNewContent(filePath: string): string | null {
const offset = this.cursors[filePath] ?? 0;
let size: number;
setCursor(filePath: string, offset: number): void {
this.cursors[filePath] = Math.max(0, offset);
}

getFileSize(filePath: string): number | null {
try {
size = statSync(filePath).size;
return statSync(filePath).size;
} catch {
return null;
}
}

if (size <= offset) return null;
readContentFromOffset(filePath: string, offset: number): { content: string | null; nextOffset: number } | null {
const size = this.getFileSize(filePath);
if (size === null) {
return null;
}

if (size <= offset) {
return { content: null, nextOffset: offset };
}

const fd = openSync(filePath, 'r');
try {
const length = size - offset;
const nextOffset = size;
const length = nextOffset - offset;
const buf = Buffer.alloc(length);
readSync(fd, buf, 0, length, offset);
this.cursors[filePath] = size;
return buf.toString('utf-8');
return {
content: buf.toString('utf-8'),
nextOffset,
};
} finally {
closeSync(fd);
}
}

readNewContent(filePath: string): string | null {
const offset = this.cursors[filePath] ?? 0;
const result = this.readContentFromOffset(filePath, offset);
if (!result?.content) {
return null;
}
this.cursors[filePath] = result.nextOffset;
return result.content;
}

start(): void {
if (this.fsWatcher) return;

Expand Down
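The refactor above separates a pure "read from offset" step from the step that advances the stored cursor. A minimal standalone sketch of that split follows. The names `readFromOffset`, `readNew`, and the module-level `cursors` map are hypothetical and are not the plugin's API. As a deliberate deviation from the diff, this sketch advances the offset by the byte count that `readSync` actually returns instead of by the file size from `statSync`.

```typescript
import { closeSync, openSync, readSync, statSync } from 'node:fs';

interface ReadResult {
  content: string | null;
  nextOffset: number;
}

// Hypothetical helper: reads bytes in [offset, size) without touching any cursor state.
function readFromOffset(filePath: string, offset: number): ReadResult | null {
  let size: number;
  try {
    size = statSync(filePath).size;
  } catch {
    return null; // file is missing or unreadable
  }
  if (size <= offset) {
    // Nothing new; the caller keeps its current offset.
    return { content: null, nextOffset: offset };
  }
  const fd = openSync(filePath, 'r');
  try {
    const buf = Buffer.alloc(size - offset);
    const bytesRead = readSync(fd, buf, 0, buf.length, offset);
    return {
      content: buf.subarray(0, bytesRead).toString('utf-8'),
      nextOffset: offset + bytesRead,
    };
  } finally {
    closeSync(fd);
  }
}

// Hypothetical cursor store: the caller decides when the cursor moves.
const cursors: Record<string, number> = {};

function readNew(filePath: string): string | null {
  const result = readFromOffset(filePath, cursors[filePath] ?? 0);
  if (!result?.content) return null;
  cursors[filePath] = result.nextOffset;
  return result.content;
}
```

Keeping the read side-effect free lets a caller replay a file from offset 0 or from any saved offset without disturbing the live cursor. Two cases are not handled in this sketch: a file that shrinks below the stored offset, and a read that ends in the middle of a multi-byte UTF-8 character.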
63 changes: 28 additions & 35 deletions packages/openclaw-plugin/README.md
@@ -63,12 +63,6 @@ Then activate Memrok as your context engine. Add to your `openclaw.json` (or use
"memrok": {
"enabled": true,
"config": {
"scribeProvider": "openai",
"scribeModel": "gpt-5-mini",
"reflection": {
"provider": "openai",
"model": "gpt-5"
},
"bootstrap": {
"enabled": false
}
@@ -79,45 +73,41 @@ Then activate Memrok as your context engine. Add to your `openclaw.json` (or use
}
```

Restart OpenClaw. Memrok watches session transcripts automatically and begins curating after the first idle window.

Set both transcript and reflection provider/model explicitly in the plugin config if you do not want Memrok falling back to its built-in defaults.
Bootstrap is now **opt-in**. Enable it only if you explicitly want Memrok to seed itself from existing Markdown memory files.
Config lives under `plugins.entries.memrok.config`. Install puts the plugin in place, but your running gateway still needs a restart before it begins using the new build.

## Inspection & Evaluation
Restart OpenClaw. Memrok watches session transcripts automatically and begins curating after the first idle window.

Memrok ships with local inspection scripts so you can evaluate injected headers without writing back into Memrok state.
By default, Memrok uses the OpenClaw provider/model configuration already active in your runtime. Set transcript or reflection provider/model explicitly only when you want Memrok to diverge from the OpenClaw defaults.
Bootstrap is **opt-in**. Enable it only if you explicitly want Memrok to seed itself from existing Markdown memory files. When enabled, Memrok scans `MEMORY.md` and `memory/` across configured OpenClaw agents, not just the current workspace.
By default, Memrok stores its database and status files under the active OpenClaw state directory, typically `~/.openclaw/plugins/memrok/`.
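As a rough illustration of how those default locations could be derived, the sketch below resolves a state directory and the two file paths named in this README. It assumes the plugin uses the same environment precedence as the watcher's cursor file in this PR (`OPENCLAW_DATA_DIR`, then `OPENCLAW_STATE_DIR`, then `~/.openclaw`). That precedence is an assumption for the plugin, and `resolveStateDir` and `memrokPaths` are hypothetical names.

```typescript
import { homedir } from 'node:os';
import { resolve } from 'node:path';

type Env = Record<string, string | undefined>;

// Assumption: mirrors the fallback chain used for the watcher's cursor file.
function resolveStateDir(env: Env = process.env): string {
  return env.OPENCLAW_DATA_DIR
    || env.OPENCLAW_STATE_DIR
    || resolve(homedir(), '.openclaw');
}

// Hypothetical helper: file names taken from the README, layout assumed.
function memrokPaths(env: Env = process.env): { db: string; status: string } {
  const pluginDir = resolve(resolveStateDir(env), 'plugins', 'memrok');
  return {
    db: resolve(pluginDir, 'memrok.db'),
    status: resolve(pluginDir, 'memrok.status.json'),
  };
}
```

With no environment overrides, this yields paths under `~/.openclaw/plugins/memrok/`, which matches the typical location described above.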

Examples:
## Commands

```sh
node scripts/eval-sessions.mjs --all-sessions --dry-run --json
node scripts/eval-sessions.mjs --recent-sessions 10 --dry-run --headers
node scripts/eval-sessions.mjs --session-id <session-id> --dry-run --headers
```
Memrok also registers a `/memrok` command for operator tasks:

Notes:
- `--dry-run` / `--no-persist` prevents working-set snapshot writes during probing.
- `--json` includes the full rendered header as `headerText`.
- `--headers` is for human-readable terminal/file output.
- Optional filters such as `--topic`, `--channel`, `--provider`, and `--label` narrow session selection without changing the session-first model.
- `/memrok status` shows the current database path, watch targets, discovered memory targets, and recent Memrok activity
- `/memrok scan-memory` scans configured `MEMORY.md` files and `memory/` directories now
- `/memrok scan-memory force` reruns memory bootstrap even for files that were already bootstrapped
- `/memrok flush-sessions` runs transcript scribing immediately for any pending session chunks already seen by the watcher
- `/memrok index-sessions` replays unread session JSONL deltas from watched session paths
- `/memrok index-sessions full` rescans full watched session JSONL files from disk

## Privacy & Data Flow

Memrok is local-first, but not magically offline.

- **Local database:** Memrok stores memory in a local SQLite database at `~/.memrok/memrok.db` by default.
- **Transcript and file access:** it watches OpenClaw session directories and any configured `watchPaths`. If bootstrap is enabled, it may also scan workspace Markdown files.
- **Local database:** Memrok stores memory in a local SQLite database under the active OpenClaw state directory, typically `~/.openclaw/plugins/memrok/memrok.db`.
- **Transcript and file access:** it watches OpenClaw session directories by default and any configured `watchPaths`. If bootstrap is enabled, it may also scan `MEMORY.md` and `memory/` across configured OpenClaw agents.
- **Default posture:** bootstrap is disabled by default; broad file scanning should be an explicit choice.
- **Remote model providers:** if you configure a remote provider for scribe passes, transcript and file content will be sent to that provider as part of normal operation.
- **Risk controls:** narrow `watchPaths`, disable bootstrap if you do not want broad file scanning, prefer local models where available, and consider disabling the reflective scribe if you want to minimize exfiltration risk.
- **Operational hygiene:** treat `~/.memrok/memrok.db` as sensitive data; back it up and secure it accordingly.
- **Operational hygiene:** treat `~/.openclaw/plugins/memrok/memrok.db` as sensitive data; back it up and secure it accordingly.

## Hardened / low-exfiltration posture

If you want a stricter setup:

- set `scribeProvider` / `scribeModel` to a local provider when possible
- keep OpenClaw’s default provider/model local when possible, or set `scribeProvider` / `scribeModel` explicitly for Memrok
- keep `bootstrap.enabled` off unless you explicitly need seeding from Markdown files
- narrow `watchPaths` to only what Memrok should ingest
- disable or narrow reflection if you want less model-side synthesis
@@ -146,29 +136,32 @@ packages/

### Configuration

Most options are optional, but scribe provider/model should be set explicitly through OpenClaw config instead of relying on Memrok-owned defaults.
Most options are optional. Memrok inherits OpenClaw’s default provider/model unless you override them here.

| Option | Type | Default | Description |
| -------------------------- | -------- | -------------- | ---------------------------------------- |
| `dbPath` | string | state dir | Path to the SQLite database |
| `scribeProvider` | string | none | Model provider for the transcript scribe; set explicitly in OpenClaw config |
| `scribeModel` | string | none | Model for the transcript scribe; set explicitly in OpenClaw config |
| `watchPaths` | string[] | session dirs | Additional transcript paths to watch |
| `dbPath` | string | `${OPENCLAW_STATE_DIR}/plugins/memrok/memrok.db` | Path to the SQLite database |
| `scribeProvider` | string | OpenClaw default | Override provider for the transcript scribe |
| `scribeModel` | string | OpenClaw default | Override model for the transcript scribe |
| `watchPaths` | string[] | auto-detected OpenClaw session dirs | Additional transcript paths to watch |
| `bootstrap.enabled` | boolean | false | Opt in to seeding from existing Markdown memory files |
| `bootstrap.scanConfiguredAgents` | boolean | true | Scan `MEMORY.md` and `memory/` across configured OpenClaw agents |
| `bootstrap.memoryDirs` | string[] | auto-discovered agent memory dirs | Extra memory directories to seed from |
| `bootstrap.memoryIndexes` | string[] | auto-discovered agent `MEMORY.md` files | Extra `MEMORY.md` files to seed from |
| `deltaThreshold` | number | 20 | Messages before triggering consolidation |
| `idleMinutes` | number | 15 | Quiet time required before scribe runs |
| `tokenBudget` | number | 1000 | Max tokens for injected memory headers |
| `reflection.enabled` | boolean | true | Enable the reflective scribe |
| `reflection.deltaPasses` | number | 5 | Transcript passes between reflections |
| `reflection.cooldownHours` | number | 24 | Minimum hours between reflection runs |
| `reflection.model` | string | scribeModel | Override model for reflection; otherwise inherits explicit transcript model |
| `reflection.provider` | string | scribeProvider | Override provider for reflection; otherwise inherits explicit transcript provider |
| `reflection.model` | string | OpenClaw / scribe model | Override model for reflection |
| `reflection.provider` | string | OpenClaw / scribe provider | Override provider for reflection |
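The option table can also be read as a type plus a set of defaults. The sketch below is one possible TypeScript rendering. The `MemrokConfig` interface and `applyDefaults` function are hypothetical and are not exported by the plugin. Only the literal defaults listed in the table are filled in, while provider, model, and path options are left unset so that the runtime's inherited values apply.

```typescript
// Hypothetical shape derived from the README table; not the plugin's own types.
interface MemrokConfig {
  dbPath?: string;
  scribeProvider?: string;
  scribeModel?: string;
  watchPaths?: string[];
  bootstrap?: {
    enabled?: boolean;
    scanConfiguredAgents?: boolean;
    memoryDirs?: string[];
    memoryIndexes?: string[];
  };
  deltaThreshold?: number;
  idleMinutes?: number;
  tokenBudget?: number;
  reflection?: {
    enabled?: boolean;
    deltaPasses?: number;
    cooldownHours?: number;
    model?: string;
    provider?: string;
  };
}

// Fills in the literal defaults from the table; user-supplied values win.
function applyDefaults(config: MemrokConfig = {}) {
  return {
    ...config,
    bootstrap: { enabled: false, scanConfiguredAgents: true, ...config.bootstrap },
    deltaThreshold: config.deltaThreshold ?? 20,
    idleMinutes: config.idleMinutes ?? 15,
    tokenBudget: config.tokenBudget ?? 1000,
    reflection: { enabled: true, deltaPasses: 5, cooldownHours: 24, ...config.reflection },
  };
}
```

For example, `applyDefaults({ bootstrap: { enabled: true } })` turns bootstrap on while keeping `scanConfiguredAgents` at `true` and `tokenBudget` at `1000`.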

## Status

Deployed as an OpenClaw context engine plugin with dual-scribe architecture. 93 tests across the monorepo.

Memrok also writes a small health snapshot to `~/.memrok/memrok.status.json`, including recent transcript-scribe, reflective-scribe, and injection activity plus last error and node count.
Memrok also writes a small health snapshot under the OpenClaw state dir, typically `~/.openclaw/plugins/memrok/memrok.status.json`, including recent transcript-scribe, reflective-scribe, and injection activity plus last error and node count.

For the full technical design, see [`docs/architecture.md`](docs/architecture.md).

30 changes: 23 additions & 7 deletions packages/openclaw-plugin/SKILL.md
@@ -13,30 +13,46 @@ Persistent memory layer for OpenClaw AI agents. Watches conversation transcripts
openclaw plugins install clawhub:memrok
```

Memrok configuration lives under `plugins.entries.memrok.config`. After install, restart the OpenClaw gateway so the running process picks up the plugin.

## Configuration

All options are optional — defaults work without tuning.
All options are optional. Memrok uses OpenClaw's configured provider/model by default and stores its local state under the active OpenClaw state directory, typically `~/.openclaw/plugins/memrok/`.

| Option | Description | Default |
|--------|-------------|---------|
| `scribeProvider` | LLM provider for knowledge extraction | `anthropic` |
| `scribeModel` | Model for scribe passes | `claude-sonnet-4-6` |
| `scribeProvider` | Override provider for knowledge extraction | OpenClaw default |
| `scribeModel` | Override model for scribe passes | OpenClaw default |
| `watchPaths` | Directories to watch for transcript changes | auto-detected session dirs |
| `bootstrap.enabled` | Scan `MEMORY.md` and `memory/` at startup | `false` |
| `bootstrap.scanConfiguredAgents` | Include all configured OpenClaw agents in bootstrap scans | `true` |
| `bootstrap.memoryDirs` | Additional memory directories to scan | auto-discovered agent memory dirs |
| `bootstrap.memoryIndexes` | Additional `MEMORY.md` files to scan | auto-discovered agent `MEMORY.md` files |
| `tokenBudget` | Max tokens for injected context header | `1000` |
| `deltaThreshold` | Message count before triggering scribe | `20` |
| `idleMinutes` | Quiet time required before scribe runs | `15` |

## Commands

- `/memrok status` shows current Memrok paths, targets, and recent activity
- `/memrok scan-memory` scans configured Markdown memory sources now
- `/memrok scan-memory force` rescans already-bootstrapped Markdown memory sources
- `/memrok flush-sessions` runs transcript scribing immediately for pending watcher chunks
- `/memrok index-sessions` indexes unread watched session JSONL deltas from disk
- `/memrok index-sessions full` replays full watched session JSONL files from disk

## What It Does

1. **Watches** OpenClaw session transcript files for changes
2. **Extracts** knowledge via scribe passes (entities, relationships, preferences, patterns)
3. **Stores** in a local SQLite knowledge graph
4. **Injects** relevant context as a header into every agent session turn
2. **Discovers** configured OpenClaw agent workspaces for bootstrap scans when enabled
3. **Extracts** knowledge via scribe passes (entities, relationships, preferences, patterns)
4. **Stores** in a local SQLite knowledge graph under the OpenClaw state dir
5. **Injects** relevant context as a header into every agent session turn

## Requirements

- OpenClaw v0.30+
- A configured LLM provider for scribe (uses the OpenClaw runtime's model plumbing)
- A configured LLM provider for scribe unless you override Memrok with explicit provider/model values

## Links
