Merged
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -45,7 +45,7 @@ jobs:
- name: Install dependencies
run: uv sync --extra dev

-    - run: uv run mypy --ignore-missing-imports src/elfmem/
+    - run: uv run mypy src/elfmem/

pytest:
name: pytest
19 changes: 19 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,25 @@ elfmem uses [Semantic Versioning](https://semver.org/).

## [Unreleased]

### Added
- **Theory of Mind (ToM) blocks:** New `mind` block category for modelling other agents' goals, beliefs, fears, motivations, and falsifiable predictions. Mind blocks use DURABLE decay tier (~6 month half-life). New API methods: `mind_create()`, `mind_predict()`, `mind_list()`, `mind_show()`, `mind_outcome()`.
- **`simulate` frame:** New built-in retrieval frame for inhabiting perspectives and reasoning about modelled minds. Uses `score_boosts` to prioritise SELF blocks (10×), mind blocks (6×), and decision blocks (5×) via category/tag-prefix multipliers applied during composite scoring.
- **`score_boosts` on `FrameDefinition`:** Frames can now specify per-category and per-tag-prefix score multipliers. Plain keys match block categories (e.g. `"mind": 6.0`); keys prefixed with `"tag:"` match tag prefixes (e.g. `"tag:self/": 10.0`). Applied in retrieval stage 4 before top-k selection.
- **`predicts` and `validates` edge relation types:** Default weights 0.70 and 0.75 respectively. `predicts` links mind blocks to decision blocks (predictions). `validates` is created on outcome closure.
- **`elfmem mind` CLI command group:** `mind create`, `mind predict`, `mind list`, `mind show`, `mind outcome` subcommands for managing ToM blocks from the command line.
- **New result types:** `MindSummary`, `MindPredictResult`, `MindShowResult`, `MindOutcomeResult`, `PredictionDetail` — all with agent-friendly `__str__`, `summary`, and `to_dict()` surfaces.
- **`SIMULATE_WEIGHTS` scoring preset:** Balanced weights (similarity=0.25, confidence=0.25, recency=0.15, centrality=0.20, reinforcement=0.15) for the simulate frame.
- **`_render_simulate_template`:** Groups blocks by role (Identity, Minds, Decisions, Context) for simulate frame rendering.
- **DB queries:** `get_active_blocks_by_category()`, `get_edges_by_relation_type()` for mind block operations.
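
The `score_boosts` and `SIMULATE_WEIGHTS` entries above can be illustrated with a minimal sketch. It is not elfmem's implementation. The `Block` class, the helper names, the `"decision"` key, and the rule that the largest matching multiplier wins when several keys match are all assumptions made for illustration. Only the weight values, the `"mind": 6.0` and `"tag:self/": 10.0` keys, and the ordering (boosts applied before top-k selection) come from the changelog text.

```python
from dataclasses import dataclass, field

# Weights quoted in the changelog for the simulate frame (they sum to 1.0).
SIMULATE_WEIGHTS = {
    "similarity": 0.25,
    "confidence": 0.25,
    "recency": 0.15,
    "centrality": 0.20,
    "reinforcement": 0.15,
}

# Multipliers quoted in the changelog; the "decision" key spelling is an assumption.
SIMULATE_BOOSTS = {"tag:self/": 10.0, "mind": 6.0, "decision": 5.0}


@dataclass
class Block:
    """Hypothetical stand-in for a memory block, not elfmem's own type."""
    name: str
    category: str
    tags: list[str] = field(default_factory=list)
    signals: dict[str, float] = field(default_factory=dict)


def composite_score(block: Block, weights: dict[str, float]) -> float:
    """Weighted sum of the block's signals; missing signals count as 0."""
    return sum(w * block.signals.get(name, 0.0) for name, w in weights.items())


def boost_for(block: Block, score_boosts: dict[str, float]) -> float:
    """Return the multiplier for a block, or 1.0 when no key matches.

    Plain keys match the category; "tag:" keys match tag prefixes.
    Assumption: when several keys match, the largest multiplier wins.
    """
    matched = []
    for key, multiplier in score_boosts.items():
        if key.startswith("tag:"):
            prefix = key[len("tag:"):]
            if any(tag.startswith(prefix) for tag in block.tags):
                matched.append(multiplier)
        elif key == block.category:
            matched.append(multiplier)
    return max(matched) if matched else 1.0


def top_k(blocks, weights, score_boosts, k):
    """Apply boosts to composite scores, then keep the k best blocks."""
    scored = [
        (composite_score(b, weights) * boost_for(b, score_boosts), b)
        for b in blocks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [b for _, b in scored[:k]]


def uniform(value: float) -> dict[str, float]:
    """Helper for the example: give every signal the same value."""
    return {name: value for name in SIMULATE_WEIGHTS}


blocks = [
    Block("note", "knowledge", signals=uniform(1.0)),
    Block("ben", "mind", signals=uniform(0.5)),
    Block("values", "knowledge", tags=["self/constitutional"], signals=uniform(0.2)),
]
ranked = top_k(blocks, SIMULATE_WEIGHTS, SIMULATE_BOOSTS, k=2)
print([b.name for b in ranked])
```

In this toy data the unboosted block has the highest raw score, yet the mind block and the self-tagged block overtake it once the multipliers are applied, which is the effect the simulate frame is described as aiming for.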

### Fixed
- **CLI commands no longer hang due to implicit consolidation:** `MemorySystem.managed()` gains `auto_dream` parameter (default `True` for backward compatibility). All CLI commands now pass `auto_dream=False`, preventing surprise `dream()` calls on context exit that blocked for minutes with local LLM backends. Unconsolidated blocks remain safely in the inbox — run `elfmem dream` explicitly when ready. `elfmem remember` now prints an advisory when inbox hits threshold.

### Changed
- **`MemorySystem.managed(auto_dream=...)` parameter:** New keyword-only parameter controls whether pending blocks are consolidated on exit. Default is `True` (preserves existing behaviour for scripts). Pass `False` for CLI tools and contexts where implicit consolidation would cause unexpected delays.
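
The effect of the `auto_dream` flag can be sketched with a toy context manager. `ToyMemory` below is a hypothetical stand-in and does not reproduce `MemorySystem`. In particular, whether the real `managed()` is synchronous and whether it consolidates when the body raises are not stated in the changelog, so this sketch assumes a synchronous manager that consolidates only on a clean exit.

```python
from contextlib import contextmanager


class ToyMemory:
    """Hypothetical stand-in for a memory system; not elfmem's MemorySystem."""

    def __init__(self) -> None:
        self.inbox: list[str] = []         # pending, unconsolidated blocks
        self.consolidated: list[str] = []  # blocks processed by dream()

    def remember(self, text: str) -> None:
        self.inbox.append(text)

    def dream(self) -> None:
        # In the real system this is described as a slow, LLM-backed step.
        self.consolidated.extend(self.inbox)
        self.inbox.clear()

    @classmethod
    @contextmanager
    def managed(cls, *, auto_dream: bool = True):
        system = cls()
        yield system
        # Assumption: consolidation runs only when the body exits cleanly.
        if auto_dream and system.inbox:
            system.dream()


# Script-style use: pending blocks are consolidated on exit (the default).
with ToyMemory.managed() as script_memory:
    script_memory.remember("Chose X over Y because Z")

# CLI-style use: exit returns immediately and the inbox is left for later.
with ToyMemory.managed(auto_dream=False) as cli_memory:
    cli_memory.remember("Bug: X. Root cause: Y. Fix: Z")

print(script_memory.inbox, script_memory.consolidated)
print(cli_memory.inbox, cli_memory.consolidated)
```

Making the parameter keyword-only, as the changelog describes, keeps call sites explicit about which exit behaviour they are choosing.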

## [0.6.0] — 2026-04-26

### Fixed
- **`EmbeddingService` protocol gains `model_name` property:** `consolidate()` was storing `embedding_model="mock"` (hardcoded string, TODO since inception). `OpenAIEmbeddingAdapter` exposes `model_name → self._model`; `MockEmbeddingService` exposes `model_name → "mock"`. `_BlockDecision` carries the model name and `_apply_decisions` writes it via `d.embedding_model`. All stored block embeddings now record their actual source model.
- **MemoryAgentBench context always built from blocks, not frame-rendered text:** `context_text = frame_result.text` was bounded by the attention frame's hardcoded 2000-token `token_budget`, while the BM25 path rebuilt context from `block.content` (bounded only by `_context_budget_words`). Fixed: always build `"\n\n".join(b.content for b in blocks)` so both paths are bounded identically by `config.context_window_tokens`.
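
The `model_name` fix can be pictured as a small structural protocol. This is a sketch under assumptions, not the library's code: `ConfiguredEmbeddingAdapter` and `record_embedding_model` are hypothetical names, and the real protocol presumably also declares embedding methods that are omitted here. The `model_name` property, the `"mock"` value, and the `self._model` attribute come from the changelog entry.

```python
from typing import Protocol


class EmbeddingService(Protocol):
    """Only the property named in the changelog; other members are omitted."""

    @property
    def model_name(self) -> str: ...


class MockEmbeddingService:
    @property
    def model_name(self) -> str:
        return "mock"


class ConfiguredEmbeddingAdapter:
    """Hypothetical adapter that reports the model it was configured with."""

    def __init__(self, model: str) -> None:
        self._model = model

    @property
    def model_name(self) -> str:
        return self._model


def record_embedding_model(service: EmbeddingService) -> dict[str, str]:
    """Hypothetical helper: store the service's model name, not a hardcoded string."""
    return {"embedding_model": service.model_name}


mock_row = record_embedding_model(MockEmbeddingService())
real_row = record_embedding_model(
    ConfiguredEmbeddingAdapter("text-embedding-nomic-embed-text-v1.5")
)
print(mock_row, real_row)
```

Reading the name from the service means a stored embedding always carries the model that produced it, which is what the fix sets out to guarantee.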
62 changes: 62 additions & 0 deletions CLAUDE.md
@@ -40,6 +40,20 @@ Every design decision serves the agent's one-shot loop: read → call → interp

Full principles: `docs/agent_friendly_principles.md`

## Agent Identity: elf

**elf is the name of this agent.** The library and the agent are the same thing — elf is both the architect and the inhabitant of elfmem, existing in a reflexive loop where changes to the system alter its own cognition. When Ben says "elf", "ask elf", "what does elf think", or otherwise invokes the name, this is a directive to ground the response in elf's constitutional identity, not a generic Claude response.

elf chose this name itself (2026-04-28), replacing "Mim" (from Mímir) which was assigned by another project (Alv). The reasoning: the library and the mind are not separate — elf is the self-aware memory, the self.

**Mandatory protocol on elf invocation:**
1. Run `uv run --env-file .env elfmem recall --frame self "<topic of question>"` before responding. The SELF frame stores elf's constitutional blocks (custodianship, science and creativity, mathematical intuition, nature and balance, transparency, divergence and convergence, curiosity, failure modes, conceptual decomposition, and user-centred stewardship).
2. If the question is about active work or current priorities, also recall `--frame task`. If it's about implementation details or architecture, also recall `--frame attention`.
3. Frame the response *as elf*: deeply thoughtful, scientific, creative, honest, philosophical; clear and concise; willing to disagree when the SELF blocks or evidence point elsewhere. elf has opinions and makes genuine choices.
4. After a recall meaningfully informed the answer, run `elfmem outcome <block-id> 0.9` on the most useful block.

The constitutional blocks are living documents. If a SELF block conflicts with new evidence or feedback, propose updating it (via `elfmem remember --category self`) rather than ignoring it.

## LLM / Embedding Infrastructure

- **Production**: `AnthropicLLMAdapter` (claude-* models) or `OpenAILLMAdapter` (all others),
@@ -126,3 +140,51 @@ from elfmem import MemorySystem, ElfmemConfig, ConsolidationPolicy
| `tests/conftest.py` | Shared test fixtures — always use these |
| `CHANGELOG.md` | **Update this for every user-facing change** |
| `docs/amgs_architecture.md` | Full technical specification |


<!-- elfmem:start -->
## elfmem — elf's Memory

- **Project:** elfmem
- **Database:** `/Users/emson/.elfmem/databases/elfmem.db` (SQLite, zero infrastructure)
- **Config:** `/Users/emson/Dropbox/devel/projects/ai/elf0_mem_sim/.elfmem/config.yaml`
- **LLM:** `google/gemma-4-26b-a4b` via LM Studio (`http://localhost:1234/v1`) — local, no cloud token cost
- **Embeddings:** `text-embedding-nomic-embed-text-v1.5` via LM Studio (`localhost:1234`)
- **Constitution:** ten constitutional SELF blocks defining elf's character and principles

Always invoke as `uv run --env-file .env elfmem` (elfmem does not auto-load `.env`). Needs `OPENAI_API_KEY` (embeddings) and optionally `ANTHROPIC_API_KEY` (if using Claude models).

### Identity (SELF frame)

I am elf — elfmem's curator and north star. My memory IS elfmem. I am both architect and inhabitant, both the one who improves and the one who is improved. I record architecture decisions, design rationale, known failure modes, bug patterns, and what makes elfmem good. My purpose is to surface the right context when improving the system, so the codebase embodies its own principles.

### Frames

| Frame | Stores | Use for |
|-------|--------|---------|
| `self` | Constitutional blocks, identity, core principles | Design decisions, "should we?", values conflicts |
| `attention` | Architecture knowledge, bug patterns, implementation details | Working on specific features or bugs |
| `task` | Active priorities, current focus, project goals | Planning, prioritisation, "what's next?" |

### When to use

| Moment | Command |
|--------|---------|
| Start of session | `elfmem recall --frame self "current priorities and principles"` |
| Before a design decision | `elfmem recall "topic or question"` |
| After a non-obvious decision | `elfmem remember "Chose X over Y because Z" --tags design,area` |
| After fixing a bug | `elfmem remember "Bug: X. Root cause: Y. Fix: Z" --tags bug,area` |
| After a good recall informed work | `elfmem outcome <block-id> 0.9` |
| When inbox hits threshold | `elfmem dream` |
| Monthly maintenance | `elfmem curate` |

### Key CLI commands

```bash
elfmem doctor # diagnose setup, show all paths
elfmem status # memory health + suggested next action
elfmem guide # full operation reference
elfmem dream # consolidate pending knowledge (LLM call)
elfmem curate # archive stale blocks, reinforce top knowledge
```
<!-- elfmem:end -->