
refactor(bonsai): English-verb DSL grammar + lite variant + persistent KV cache#170

Open
KailasMahavarkar wants to merge 3 commits into main from feat/bonsai-caveman-output

Conversation

KailasMahavarkar (Contributor) commented Apr 20, 2026

Summary

  • 100% coverage of the grammar.lark NL-addressable surface (94 rules) via English-keyword @-verbs (@upsert, @belief, @Remember, @snapshot, @checkpoint, @CRON_ADD, @EVOLVE_RULE, ...). Short-code abbreviations are removed from dispatch.
  • Two prompt variants ship with the package: full (all 94 verbs, ~1700 tokens) and lite (16 ingest+retrieval verbs, ~800 tokens). Lite mode cuts cold load from 19s -> 8s and avoids verb-picking confusion on conversational turns.
  • Auto n_ctx picks the smallest power-of-two that fits the loaded prompt + user-msg budget + output + headroom. Callers can still pin explicitly.
  • Persistent KV cache saves/loads to disk via kv_cache_path. Cold start goes 8s -> 0.4s (19x faster) across process restarts. Meta-guarded so skill / model / n_ctx changes invalidate the cache automatically.
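The meta guard on the persistent KV cache can be sketched as a small metadata check alongside the cache file; the file layout, field names, and helper names here are illustrative assumptions, not the actual implementation:

```python
import hashlib
import json
from pathlib import Path

def _cache_meta(skill_text: str, model_path: str, n_ctx: int) -> dict:
    # Hypothetical metadata: everything that must match for the cache to be reusable.
    return {
        "skill_sha": hashlib.sha256(skill_text.encode()).hexdigest(),
        "model": model_path,
        "n_ctx": n_ctx,
    }

def cache_is_valid(meta_path: Path, skill_text: str, model_path: str, n_ctx: int) -> bool:
    """True only if the saved metadata matches the current skill/model/n_ctx."""
    if not meta_path.exists():
        return False
    return json.loads(meta_path.read_text()) == _cache_meta(skill_text, model_path, n_ctx)

def write_cache_meta(meta_path: Path, skill_text: str, model_path: str, n_ctx: int) -> None:
    meta_path.write_text(json.dumps(_cache_meta(skill_text, model_path, n_ctx)))
```

Any change to the skill text, model path, or n_ctx makes the comparison fail, so the stale cache is simply rebuilt instead of being loaded into a mismatched context.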

What changed

  • Parser rewritten as a factory dispatch table (_h_slug / _h_topic / _h_walk / _h_pair / _h_query / _h_plain / _h_raw) plus targeted specials for complex verbs.
  • compact dual-mode removed; prompt-driven is the only mode. CompactTurn -> ParsedTurn, _parse_compact_output -> _parse_verb_output, _COMPACT_HANDLERS -> _VERB_HANDLERS, _DEFAULT_COMPACT_* symbols deleted.
  • Prompt files moved from tools/skills/graphstore-bonsai-dsl-compact/ into src/graphstore/ so they ship with the wheel. pyproject.toml package-data now includes *.txt.
  • Grammar bugs fixed:
    • @SNAPSHOT with no name auto-fills a UTC timestamp (grammar requires SNAPSHOT STRING).
    • @COMPACT now emits SYS OPTIMIZE COMPACT (bare SYS COMPACT isn't a real rule).

Performance (AMD 9700X, DDR5-5200, 4B TQ1_0)

  • Peak decode 27-30 tok/s (memory-bandwidth bound at ~810 MB weight read per token)
  • Per-call wall 0.3-2s; overall 15-20 tok/s
  • Cold load 8s (lite) / 19s (full)
  • Cold load with persistent KV cache: 0.4s (19x faster)
  • The @-prefix parser gate renders English drift, <think> leaks, and code fences inert at the parser level (no DSL corruption possible)
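The gate itself reduces to a one-pass line filter; a minimal sketch (the function name is hypothetical):

```python
def gated_lines(model_output: str) -> list[str]:
    """Dispatch only lines that start with '@'; prose, <think> blocks,
    and fence markers simply never reach the verb parser."""
    return [ln.strip() for ln in model_output.splitlines()
            if ln.strip().startswith("@")]
```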

Test plan

  • 89 unit tests pass (pytest tests/test_bonsai_ingestor.py)
  • 107 synthesized DSL templates parse clean vs grammar.lark
  • Live bench on 10 in-scope prompts: 10/10 correct on lite scope
  • Persistent KV cache demo measured 19x cold-start speedup
  • LoCoMo F1 rerun vs baseline (follow-up PR)
  • End-to-end smoke with live GraphStore round-trip (follow-up PR)

Generated with Claude Code.

Before: U/F/D compact verbs for the 3 ingest paths, with an `!` escape
hatch to pass any other DSL through verbatim. Model had to remember two
tokenizations and the escape never compressed.

After: one positional verb table covering the whole common DSL surface -
ingest (U/F/D), edges (E), retrieval (RM/SM/LX/AQ), walks (RL/TR/AN/SG),
sys/vault (SS/SC/SH/ST/SX/VS). Python expands each to the full DSL line.
Every path hits the same ~3-5x output-token reduction now, not just
ingest.
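The positional-verb idea can be sketched as a template table; the verbs and DSL templates below are simplified assumptions for illustration, not the real table:

```python
# Hypothetical slice of the verb table: compact verb -> DSL template whose
# slots are filled positionally from the rest of the line.
VERB_TABLE = {
    "RM": "QUERY REMEMBER {0}",
    "SM": "QUERY SIMILAR {0} LIMIT {1}",
    "SS": 'SYS SNAPSHOT "{0}"',
}

def expand(line: str) -> str:
    verb, *args = line.split()
    return VERB_TABLE[verb].format(*args)
```

The model only ever emits the short positional form; Python owns the full DSL rendering, which is where the ~3-5x output-token reduction comes from.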

Changes:
- Replace `_parse_compact_output` with a dispatch table of verb handlers
  built from small factories (_h_upsert, _h_fact, _h_drop, _h_edge,
  _h_query, _h_walk, _h_plain).
- Swap `CompactTurn.raw_dsl` for `CompactTurn.statements`: pre-rendered
  DSL lines for every non-ingest verb. `_synthesize_dsl` appends them
  verbatim after the message node + mention wiring + fact updates.
- SKILL.md rewritten to v5: 16 verbs documented, examples per common
  path, ~900 tokens.
- Unit tests: drop `!`-escape block, add per-verb coverage for edges,
  all 4 retrieval verbs, all 4 walks, all 6 sys/vault ops, and the
  aliased long forms (REMEMBER/SIMILAR/RECALL/TRAVERSE).

Test results: 78/78 bonsai unit tests pass; full suite 1880 pass,
101 skip (unchanged).
The NL->DSL ingestor now covers 100% of the grammar.lark NL-addressable
surface (94 rules) via English-keyword @-verbs. Short-code abbreviations
(@U/@F/@RM/etc.) are gone - every verb is a readable DSL keyword
(@upsert, @belief, @Remember, @snapshot, @checkpoint, @CRON_ADD,
@EVOLVE_RULE, ...). Full dispatch has ~100 entries including grammar
aliases (ASSERT->BELIEF, FORGET_NODE->FORGET, etc.).

Two prompt variants now ship with the package:
  - bonsai_dsl_prompt.txt: full 94-verb surface, ~1700 tokens, n_ctx=4096
  - bonsai_dsl_prompt_lite.txt: 16-verb ingest+retrieval subset,
    ~800 tokens, n_ctx=2048. Fewer competing verbs means the model picks
    correctly on conversational turns. Load time 19s -> 8s.

n_ctx auto-picks smallest power-of-two that fits the loaded prompt +
typical user-msg budget + max_output + headroom. Callers can still
pin n_ctx explicitly.
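The auto-pick reduces to doubling from a floor until the token budget fits; the default budgets and bounds below are illustrative assumptions, not the shipped values:

```python
def auto_n_ctx(prompt_tokens: int, user_budget: int = 512,
               max_output: int = 256, headroom: int = 128,
               floor: int = 512, ceiling: int = 8192) -> int:
    """Smallest power of two >= prompt + user-msg budget + output + headroom."""
    need = prompt_tokens + user_budget + max_output + headroom
    n = floor
    while n < need and n < ceiling:
        n *= 2
    return n
```

With these assumed budgets, an ~800-token lite prompt lands on 2048 and an ~1700-token full prompt on 4096, consistent with the two variants above.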

Parser rewritten as a factory dispatch table:
  - _h_slug / _h_topic / _h_walk / _h_pair / _h_query / _h_plain / _h_raw
  - Special handlers for update_node, merge, increment, propagate,
    describe, unregister, contradictions, cron_add, optimize, clear,
    wal, nodes, vault_triplet, snapshot (auto-timestamp fallback).
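The factory pattern can be sketched like this: each factory returns a closure for one argument shape, and the dispatch table maps verbs to closures. The templates and verbs here are illustrative, not the actual table:

```python
def _h_slug(template: str):
    # Factory for verbs that take a single slug argument.
    def handler(args: list[str]) -> str:
        return template.format(slug=args[0])
    return handler

def _h_pair(template: str):
    # Factory for verbs that take two positional arguments.
    def handler(args: list[str]) -> str:
        return template.format(a=args[0], b=args[1])
    return handler

_VERB_HANDLERS = {
    "FORGET": _h_slug("NODE FORGET {slug}"),
    "LINK": _h_pair("EDGE ADD {a} {b}"),
}

def parse_verb_line(line: str) -> str:
    verb, *args = line.lstrip("@").split()
    return _VERB_HANDLERS[verb.upper()](args)
```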

Compact/raw-DSL mode removed. Single prompt-driven mode. `compact`
kwarg and `_DEFAULT_COMPACT_*` symbols deleted. `CompactTurn` renamed
`ParsedTurn`, `_parse_compact_output` -> `_parse_verb_output`,
`_COMPACT_HANDLERS` -> `_VERB_HANDLERS`.

Grammar bugs fixed in this pass:
  - @snapshot without name auto-fills a UTC timestamp (SNAPSHOT STRING
    is required by grammar; bare @ss was emitting invalid DSL).
  - @compact rewritten to SYS OPTIMIZE COMPACT (SYS COMPACT isn't a
    real grammar rule; SYS OPTIMIZE COMPACT is).

Prompt file moved out of tools/skills/ into src/graphstore/ so it
ships with the wheel. pyproject package-data now includes *.txt.

Performance envelope measured on AMD 9700X / DDR5-5200:
  - Cold load: 8s (lite) / 19s (full)
  - Cold load with persistent kv_cache_path: 0.4s (19x faster)
  - Peak decode: 27-30 tok/s (memory-bandwidth bound at ~810 MB
    weight read per token for 4B TQ1_0)
  - Per-call wall: 0.3-2s; overall 15-20 tok/s

Tests: 89 unit tests pass; 107 synthesized DSL templates parse clean
against grammar.lark (verify_v6_templates.py check).
KailasMahavarkar changed the title from "refactor(bonsai): compact v5 unified 16-verb grammar" to "refactor(bonsai): English-verb DSL grammar + lite variant + persistent KV cache" on Apr 20, 2026.
@retract

LongMemEval smoke revealed two real drift patterns in the lite prompt:

1. Spurious @retract on unrelated turns: with KNOWN FACTS present, the
   model was emitting @retract + @belief even when the new turn was
   about a different topic entirely. The correction-flow example in
   the prompt was the attractor: model pattern-matched any new fact
   to a correction.

2. @recall misfire on personal-fact questions: "Which city did I move
   to last year?" emitted @recall location (wrong verb, bare anchor).
   Model thought the belief topic "location" was a valid walk anchor.

Prompt changes:

- VERB PICK RULE now distinguishes personal-fact questions ("Where
  did I ...?", "Which city did I ...?"), which route to @answer, from
  named-entity connection questions, which route to @recall.

- Added explicit rule: walk/path verbs (@recall, @traverse, @Ancestors,
  @descendants, @subgraph, @path, @SHORTEST_PATH, @common) REQUIRE a
  prefixed anchor id (ent:X / fact:X / msg:X). Bare topic names like
  "location" are not valid anchors.

- Added explicit rule: @retract only fires on correction trigger words
  ("actually", "not anymore", "changed to", "now prefer", "instead").
  Unrelated new turns must NOT emit @retract even if related beliefs
  are in KNOWN FACTS.

- New NEGATIVE example showing KNOWN FACTS [fact:location]="Seattle"
  plus unrelated turn "I bought a new guitar" -> only @belief
  purchase guitar. No retract.

- Two new @answer examples for personal-fact questions: "Which city
  did I move to last year?" and "What is my favorite color?".
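The anchor rule could also be enforced defensively on the Python side; a minimal sketch (the regex and helper name are assumptions, not shipped code):

```python
import re

# Walk/path verbs need a prefixed anchor id, never a bare topic name.
_ANCHOR_RE = re.compile(r"^(ent|fact|msg):\S+$")

def is_valid_anchor(token: str) -> bool:
    return bool(_ANCHOR_RE.match(token))
```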

Verified by re-running tools/scripts style LongMemEval smoke on the
fixture: both drift cases now produce correct ops (@BELIEF-only for
unrelated turns, @answer for personal-fact questions).

Tests: 89/89 pass, no unit-test deltas (pure prompt change).