Proposal: JSON-backed repository for zero-infrastructure VRE

## Summary

Explore implementing a JSON-backed repository as an alternative to Neo4j, enabling `pip install vre` and a working demo with zero infrastructure. This is a proposal for discussion — not a committed direction.

## Problem Statement

VRE currently requires a running Neo4j instance for any usage. This creates a significant barrier to entry:
- New users must install and configure Neo4j before they can experiment with VRE
- Simple demos, tests, and prototyping require database infrastructure
- CI/CD pipelines need Neo4j services or containers
- The "try VRE in 5 minutes" experience doesn't exist

Neo4j is the right choice for production workloads — native graph traversal, relationship indexing, and Cypher queries are genuinely valuable for complex epistemic graphs. But requiring it from day one makes VRE feel heavier than it needs to be for exploration and adoption.

## Prior Art

The test suite already uses a stub repository (`StubRepository`) that implements the repository interface in-memory. This stub is the direct inspiration for this proposal — it demonstrates that VRE's core engine (grounding, policy, learning) works correctly without Neo4j. The question is whether to formalize and extend that pattern into a persistent, user-facing backend.

## Proposed Idea

**Note: This is an idea being explored, not a definitive architectural direction.**

The proposal has three parts:

### 1. Repository Protocol/ABC

Extract the implicit repository interface from `PrimitiveRepository` into an explicit `Repository` protocol or ABC:

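```python
from __future__ import annotations  # keeps the VRE type names below unevaluated

from abc import ABC, abstractmethod

# Primitive, Relatum, and RelationType are VRE's existing domain types.

class Repository(ABC):
    @abstractmethod
    def get(self, concept: str) -> Primitive | None: ...

    @abstractmethod
    def save(self, primitive: Primitive) -> None: ...

    @abstractmethod
    def get_related(self, primitive_id: str, relation_type: RelationType) -> list[Relatum]: ...

    # ... etc
```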

This formalizes what VRE expects from its storage layer and makes backend-swappability a first-class property.
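
Since the heading leaves the choice between a protocol and an ABC open, here is a minimal sketch of the protocol variant using `typing.Protocol`. The names `RepositoryProtocol` and `DictRepository`, and the two-field `Primitive` dataclass, are hypothetical stand-ins for illustration, not VRE's actual types.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass
class Primitive:
    """Hypothetical stand-in for VRE's Primitive type."""
    id: str
    concept: str


@runtime_checkable
class RepositoryProtocol(Protocol):
    """Structural interface: any class with these methods conforms."""

    def get(self, concept: str) -> Primitive | None: ...
    def save(self, primitive: Primitive) -> None: ...


class DictRepository:
    """In-memory backend that conforms without inheriting from the protocol."""

    def __init__(self) -> None:
        self._by_concept: dict[str, Primitive] = {}

    def get(self, concept: str) -> Primitive | None:
        return self._by_concept.get(concept)

    def save(self, primitive: Primitive) -> None:
        self._by_concept[primitive.concept] = primitive


repo = DictRepository()
repo.save(Primitive(id="p1", concept="file"))
# runtime_checkable isinstance checks method presence only, not signatures.
assert isinstance(repo, RepositoryProtocol)
```

With a protocol, an existing class such as `StubRepository` could conform without a base-class change, provided its method names match. An ABC instead fails at instantiation time when a method is missing, which may be the preferable trade-off for third-party backends.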

### 2. Refactor PrimitiveRepository

Refactor `PrimitiveRepository` (the Neo4j implementation) to implement the new `Repository` ABC. Existing behavior is unchanged — this is a structural refactor, not a behavioral one.

### 3. JsonRepository

Build a `JsonRepository` that implements `Repository` backed by an in-memory dict persisted to a JSON file:

- On init, load the JSON file into memory (or start empty)
- All reads are dict lookups — fast, no network
- Writes update the in-memory dict and flush to disk
- **BFS-based subgraph resolution**: The Neo4j repository uses Cypher's native graph traversal to resolve subgraphs during grounding. The JSON repository will need to implement BFS over its in-memory dict structure to replicate this — walking relata from anchor nodes to build the same resolved subgraph that Neo4j returns natively. This is the most significant implementation challenge.
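
The load, write-and-flush, and BFS steps above can be sketched as follows. This is an illustration under stated assumptions, not the proposed implementation: plain dicts stand in for VRE's `Primitive` and `Relatum` types, the `relata`/`target`/`type` keys and the `REQUIRES` relation name are invented for the example, and the `max_depth` hop limit is a guess at how depth would bound the traversal.

```python
from __future__ import annotations

import json
import os
import tempfile
from collections import deque
from pathlib import Path


class JsonRepositorySketch:
    """Illustrative only: plain dicts stand in for VRE's domain types."""

    def __init__(self, path: str) -> None:
        self._path = Path(path)
        # Load the file if it exists, otherwise start with an empty graph.
        if self._path.exists():
            self._nodes: dict[str, dict] = json.loads(
                self._path.read_text(encoding="utf-8")
            )
        else:
            self._nodes = {}

    def get(self, node_id: str) -> dict | None:
        return self._nodes.get(node_id)  # pure dict lookup, no I/O

    def save(self, node_id: str, node: dict) -> None:
        self._nodes[node_id] = node
        self._flush()

    def _flush(self) -> None:
        # Rewrites the whole file on every write; fine for small graphs only.
        self._path.write_text(json.dumps(self._nodes, indent=2), encoding="utf-8")

    def resolve_subgraph(self, anchors: list[str], max_depth: int) -> dict[str, dict]:
        """Breadth-first walk from the anchors, following relata up to max_depth hops."""
        seen: dict[str, dict] = {}
        queue = deque((a, 0) for a in anchors if a in self._nodes)
        while queue:
            node_id, depth = queue.popleft()
            if node_id in seen:
                continue  # already reached by a path at least as short
            node = self._nodes[node_id]
            seen[node_id] = node
            if depth == max_depth:
                continue  # do not expand beyond the hop limit
            for relatum in node.get("relata", []):
                target = relatum["target"]
                if target in self._nodes and target not in seen:
                    queue.append((target, depth + 1))
        return seen


# Usage: a three-node chain, resolved one hop out from the anchor.
path = os.path.join(tempfile.mkdtemp(), "graph.json")
repo = JsonRepositorySketch(path)
repo.save("file", {"relata": [{"type": "REQUIRES", "target": "path"}]})
repo.save("path", {"relata": [{"type": "REQUIRES", "target": "filesystem"}]})
repo.save("filesystem", {"relata": []})
print(sorted(repo.resolve_subgraph(["file"], max_depth=1)))  # ['file', 'path']
```

The `seen` check makes the walk safe on cyclic graphs. A real implementation would also need to filter by relation type and match whatever ordering and depth semantics the Cypher query uses.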

This makes the following possible:
```bash
pip install vre
python -c "from vre import VRE, JsonRepository; vre = VRE(JsonRepository('my_graph.json')); ..."
```

Neo4j becomes the production upgrade path, not the entry requirement.

## VRE Design Alignment

- **Minimal footprint**: CLAUDE.md Section 8 emphasizes minimal dependencies. A JSON backend removes the Neo4j requirement for getting started.
- **VRE contract preserved**: The agent–VRE contract (Section 5) is with VRE, not with Neo4j. Swapping the storage backend does not change the epistemic guarantees — grounding, depth gating, and policy evaluation work identically.
- **Inspectability**: A JSON file is arguably *more* inspectable than a Neo4j database — users can open it in any text editor.
- **Technology stack**: Section 8.2 specifies Neo4j for its graph properties. A JSON backend would not replace Neo4j for production use — it complements it as an on-ramp.

## Design Considerations

- **BFS correctness**: The BFS implementation must produce the same subgraphs as Neo4j's Cypher traversal. The existing test suite (which uses `StubRepository`) provides the correctness baseline: if all tests pass with `JsonRepository`, the traversal is equivalent for every case the suite covers. Cases the suite does not exercise would still need a direct comparison against Neo4j.
- **Performance**: The JSON repository will not scale to large graphs. This is acceptable because it targets exploration, demos, and small projects. The docs should be clear about when to graduate to Neo4j.
- **Graph traversal**: Without native graph indexing, each BFS traversal costs time linear in the number of nodes and edges it reaches (O(V + E)). Acceptable for small graphs, but the performance cliff should be documented.
- **Concurrency**: JSON file writes are not safe under concurrent access. Either restrict usage to a single agent or add file locking.
- **Schema evolution**: JSON files need a version field so future VRE versions can migrate old graph files.
- **What about SQLite?**: SQLite is another zero-infrastructure option with better query capabilities. Worth considering, but JSON is simpler and more inspectable for a first pass.
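
Two of the considerations above, write safety and schema evolution, can be illustrated with a short sketch. The `schema_version` and `primitives` keys, the `SCHEMA_VERSION` constant, and the function names are all hypothetical. The temp-file-and-rename pattern guards against a crash leaving a half-written file; it does not coordinate concurrent writers, which would still need locking.

```python
from __future__ import annotations

import json
import os
import tempfile

SCHEMA_VERSION = 1  # hypothetical version number for the graph file format


def write_graph(path: str, primitives: dict) -> None:
    """Write the graph via a temp file so a crash never leaves a partial file."""
    payload = {"schema_version": SCHEMA_VERSION, "primitives": primitives}
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        os.unlink(tmp_path)  # clean up the temp file, then re-raise
        raise


def read_graph(path: str) -> dict:
    """Load a graph file, refusing versions this code does not understand."""
    with open(path, encoding="utf-8") as handle:
        payload = json.load(handle)
    version = payload.get("schema_version")
    if version != SCHEMA_VERSION:
        raise ValueError(f"unsupported graph file version: {version!r}")
    return payload["primitives"]


# Usage: round-trip a one-node graph through a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "graph.json")
write_graph(demo_path, {"file": {"relata": []}})
loaded = read_graph(demo_path)
```

A migration step would slot into `read_graph` in place of the `ValueError`, upgrading older payloads before returning them.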

## Open Questions

- Is this the right abstraction boundary? Should the Repository ABC mirror the current `PrimitiveRepository` interface exactly, or should it be redesigned?
- Should the JSON file format match the Neo4j schema (node/relationship structure) or use a more natural JSON representation?
- How should seed scripts work across backends — same scripts with backend-agnostic calls, or separate seed formats?
- Should this be a separate package (`vre-json`) or included in the core `vre` package?
- Is there interest in other lightweight backends (SQLite, DuckDB) beyond JSON?
- Can the existing `StubRepository` from tests be promoted directly, or does it need significant rework for persistence and BFS?

## Dependencies

None — this can be built independently, though it would benefit from a clean Repository interface extracted first.
