doc work 2 by offx-zinth · Pull Request #27 · offx-zinth/SMP

offx-zinth · 2026-04-19T13:42:05Z

No description provided.

gemini-code-assist

Code Review

This pull request significantly expands and matures the documentation for the Structural Memory Protocol (SMP), providing a comprehensive API reference, architectural deep-dives, and detailed contribution guidelines. The updates transition the documentation from high-level summaries to production-ready specifications. Feedback focuses on refining the professional tone by removing LLM-generated conversational filler, eliminating redundant content between the architecture and contribution guides, and ensuring that method names and project structures accurately reflect the underlying implementation and registration logic.

gemini-code-assist · 2026-04-19T13:43:59Z

+Here are the detailed, production-ready `ARCHITECTURE.md` and `CONTRIBUTING.md` files based on the comprehensive Structural Memory Protocol (SMP) specifications. 
+
+### 1. `ARCHITECTURE.md`
+
+```markdown


This file contains LLM-generated conversational filler and nested markdown blocks. The preamble and the extra backticks should be removed to maintain a professional documentation style.

Suggested change

-Here are the detailed, production-ready `ARCHITECTURE.md` and `CONTRIBUTING.md` files based on the comprehensive Structural Memory Protocol (SMP) specifications.
-
-### 1. `ARCHITECTURE.md`
-
-```markdown
+# Architecture Guide: Structural Memory Protocol (SMP)
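
The preamble removal requested in this comment can also be checked mechanically. The sketch below is one possible approach, assuming the real document always begins at its first level-1 heading; `strip_preamble` is a hypothetical helper and not part of SMP.

```python
FENCE = "`" * 3  # built programmatically so this example nests cleanly


def strip_preamble(text: str) -> str:
    """Drop everything before the first level-1 heading ('# ...')."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.startswith("# "):
            return "\n".join(lines[index:]) + "\n"
    return text  # no level-1 heading found; leave the text untouched


# Sample input modelled on the generated preamble quoted in the review.
generated = "\n".join([
    "Here are the detailed, production-ready files.",
    "",
    "### 1. `ARCHITECTURE.md`",
    "",
    FENCE + "markdown",
    "# Architecture Guide: Structural Memory Protocol (SMP)",
    "",
    "Body text.",
])

cleaned = strip_preamble(generated)
```

Under these assumptions `cleaned` starts at the title line. A closing fence left at the end of a generated file would still need to be removed separately.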

gemini-code-assist · 2026-04-19T13:43:59Z

+### 2. `CONTRIBUTING.md`

-The `SeedWalkEngine` implements a 4-phase pipeline to find the most relevant code for a given query.
+```markdown
+# Contributing to SMP

-### Phase 0: Route
-The query embedding is compared against the **Level-1 Community Centroids** in ChromaDB. If the confidence exceeds a threshold, the search is scoped to that specific community (~200 nodes), drastically reducing noise.
+Thank you for contributing to the Structural Memory Protocol (SMP)! To maintain the integrity, safety, and high performance of this agentic architecture, we enforce strict guidelines. 

-### Phase 1: Seed
-A vector search is performed in ChromaDB to find the top-K "seed" nodes whose signatures or docstrings most closely match the query.
+## 🛠 Development Environment

-### Phase 2: Walk
-From the seeds, the engine performs a multi-hop traversal in Neo4j, following `CALLS_STATIC`, `CALLS_RUNTIME`, and `IMPORTS` edges. This captures the structural context (who calls this? what does this call?).
+### Python Version
+SMP requires **Python 3.11** explicitly. We heavily utilize modern features like `X | Y` unions, `tomllib`, and performance optimizations not present in older versions.

-### Phase 3: Rank
-Nodes are ranked using a composite score:
-$$\text{Score} = \alpha \cdot \text{VectorSimilarity} + \beta \cdot \text{NormalizedPageRank} + \gamma \cdot \text{HeatScore}$$
- **Vector Similarity:** Relevance to the query.
- **PageRank:** Structural importance in the graph.
- **Heat Score:** Frequency of execution (from telemetry/runtime traces).
+### Setup Instructions
+1. **Create a Virtual Environment:**
+   ```bash
+   python3.11 -m venv .venv
+   source .venv/bin/activate
+   ```
+2. **Install Dependencies:**
+   ```bash
+   pip install -e ".[dev]"
+   ```
+3. **Configure Environment:**
+   Copy `.env.example` to `.env` and configure your Neo4j and ChromaDB credentials. Note that Neo4j requires the GDS (Graph Data Science) plugin for Louvain and PageRank calculations.
+
+---

-### Phase 4: Assemble
-The engine produces a ranked list of `RankedResult` objects and a `structural_map` (adjacency list) allowing the agent to visualize the call chain.
+## 🏛️ Architecture TL;DR
+Before contributing, review `ARCHITECTURE.md`. SMP uses a layered design:
+- `core/`: AST parsing, Linking (Static + eBPF), Enrichment, and persistence mapping.
+- `engine/`: Query resolution (`SeedWalkEngine`), structural aggregations, context generation.
+- `sandbox/`: MicroVM/Docker isolation, eBPF telemetry capture, and Mutation Testing.
+- `protocol/`: JSON-RPC 2.0 endpoints utilizing the Dispatcher pattern.

 ---

-## 💾 Persistence Layer
+## 📝 Coding Standards

-SMP utilizes a dual-store strategy to balance speed and structure.
+SMP is designed to be read by humans and navigated by AI agents. Predictability is paramount.

-| Store | Technology | Role | Data Held |
-| :--- | :--- | :--- | :--- |
-| **Graph Store** | **Neo4j** | Structural Truth | Entities, Relationships, Communities, PageRank, Full-Text Index. |
-| **Vector Store** | **ChromaDB** | Entry Point | Node Embeddings, Community Centroids. |
+### Imports
+- Every file must start with `from __future__ import annotations`.
+- Group imports: `stdlib` $\rightarrow$ `third-party` $\rightarrow$ `local`, separated by blank lines.
+- **Always use absolute imports** for local modules: 
+  `from smp.core.linker import StaticLinker` (Never `from ..linker import StaticLinker`).
+
+### Type Annotations & Data Models
+- **Strict Typing:** All function signatures must have full type annotations. No implicit `Any`.
+- **Modern Unions:** Use `X | Y` instead of `Optional[X]` or `Union[X, Y]`.
+- **Built-in Generics:** Use `list[...]`, `dict[...]`, `set[...]` instead of the `typing` module equivalents.
+- **Msgspec Structs:** All data flowing through the protocol and engine must be defined as `msgspec.Struct` classes with `frozen=True` to ensure zero-copy immutability and fast JSON serialization.
+
+```python
+import msgspec
+
+class RankedResult(msgspec.Struct, frozen=True):
+    node_id: str
+    node_type: str
+    vector_score: float
+    pagerank: float
+    is_seed: bool = False
+```
+
+### Naming & Style
+- **Classes:** `PascalCase`
+- **Functions/Methods:** `snake_case`
+- **Private Members:** Prefix with `_leading_underscore`.
+- **Docstrings:** Use triple double-quotes, imperative mood, and Google style. Docstrings are heavily relied upon by the Graph RAG engine, so be descriptive.
+- **Line Length:** Max 120 characters.

 ---

-## 🔌 MCP Integration
+## 🔌 Adding Protocol Methods (The Dispatcher)
+
+We do not use massive `if/elif` routers. If you are adding a new JSON-RPC endpoint to SMP, implement it in the appropriate module under `smp/protocol/handlers/` and use the `@rpc_method` decorator.
+
+```python
+# smp/protocol/handlers/telemetry.py
+from smp.protocol.dispatcher import rpc_method
+from smp.core.models import ServerContext
+
+@rpc_method("smp/telemetry/hot")
+async def handle_telemetry_hot(params: dict, ctx: ServerContext) -> dict:
+    """Returns nodes with high churn and high blast radius."""
+    window = params.get("window_days", 30)
+    return await ctx.engine.telemetry.get_hot_nodes(window)
+```
+
+---
+
+## 🔄 Development Workflow
+
+### Branching
+- `feature/description` for new functionality.
+- `fix/description` for bug fixes.
+- `docs/description` for documentation updates.
+
+### Linting & Formatting
+We use **Ruff** to enforce formatting and linting rules.
+```bash
+# Check for lint errors
+ruff check .
+
+# Automatically format code
+ruff format .
+```
+
+### Type Checking
+We rely on strict type boundaries. Run **Mypy** before committing:
+```bash
+mypy smp/
+```
+
+### Testing
+We use **pytest** combined with `pytest-asyncio` for all asynchronous graph engine tests.
+```bash
+# Run all tests
+pytest
+
+# Run a specific module
+pytest tests/engine/test_seed_walk.py
+```
+
+---
+
+## ✅ Pre-Commit Checklist
+
+Before submitting a Pull Request, ensure you have completed these steps. Pull Requests failing CI will not be reviewed.
+
+1. [ ] Read `ARCHITECTURE.md` to ensure your change fits the architectural direction.
+2. [ ] `ruff check .` — No lint errors.
+3. [ ] `ruff format .` — Code is formatted.
+4. [ ] `mypy smp/` — Zero type errors.
+5. [ ] `pytest` — All tests pass, including integration tests spanning Neo4j and ChromaDB.

-SMP implements the **Model Context Protocol (MCP)**. This allows it to serve as a "Codebase Memory Server" for any MCP-compatible client. Instead of the agent reading files blindly, it calls SMP tools to:
-1. `locate`: Find the right starting point in a massive repo.
-2. `get_context`: Get a structural summary of a file and its dependencies.
-3. `assess_impact`: Find all nodes affected by a potential change.
+For detailed agent-specific interactions and JSON-RPC payloads, refer to `PROTOCOL.md` spec.
+```


This section redundantly includes the entire content of `CONTRIBUTING.md` within `ARCHITECTURE.md`. Documentation should be modular; please remove this redundant section and rely on the standalone `CONTRIBUTING.md` file.

gemini-code-assist · 2026-04-19T13:43:59Z

+Here is an expanded, standalone, and highly detailed `CONTRIBUTING.md` file. It goes deeper into the specific developer workflows, testing graph databases, adding protocol methods, and code standards required for the Structural Memory Protocol (SMP).

-Thank you for contributing to the Structural Memory Protocol! To maintain high code quality and architectural consistency, please follow these guidelines.
+***

-## 🛠 Development Environment
+# Contributing to the Structural Memory Protocol (SMP)


Please remove the conversational preamble at the beginning of the file. Documentation should start directly with the title.

Suggested change

Here is an expanded, standalone, and highly detailed `CONTRIBUTING.md` file. It goes deeper into the specific developer workflows, testing graph databases, adding protocol methods, and code standards required for the Structural Memory Protocol (SMP).

Thank you for contributing to the Structural Memory Protocol! To maintain high code quality and architectural consistency, please follow these guidelines.

***

## 🛠 Development Environment

# Contributing to the Structural Memory Protocol (SMP)

# Contributing to the Structural Memory Protocol (SMP)

gemini-code-assist · 2026-04-19T13:43:59Z

+  * `proposed_content` (string).
+* **Result:** `nodes_added`, `nodes_removed`, `nodes_modified`, and relationship deltas.
+
+### `smp/graph/why`


Inconsistent method name. The implementation in `RpcDispatcher` and the description in `README.md` use `smp/why`, but it is documented here as `smp/graph/why`. Please use `smp/why` for consistency.
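
For reference, a JSON-RPC 2.0 call to the method under its registered name might look like the sketch below. The `node_id` parameter and its value are placeholders for illustration only; the actual parameters of `smp/why` are not shown in this review.

```python
import json


def build_request(method: str, params: dict[str, object], request_id: int) -> str:
    """Serialize a JSON-RPC 2.0 request envelope."""
    envelope = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return json.dumps(envelope)


# "node_id" is a hypothetical parameter name used only for illustration.
request = build_request("smp/why", {"node_id": "example-node"}, request_id=1)
```

A client that sends `smp/graph/why` instead would address a method name that, per this comment, the dispatcher does not register.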

gemini-code-assist · 2026-04-19T13:44:00Z

+### `smp/linker/report`
+Lists all unresolved static edges (e.g., ambiguous calls where the target function exists in multiple files but wasn't explicitly imported).
+
+* **Params:**
+  * `scope` (string): `"full"`, `"package:<path>"`, or `"file:<path>"`.
+* **Result:** Array of `unresolved` edge definitions indicating caller and candidates.
+
+### `smp/linker/runtime`
+Retrieves all `CALLS_RUNTIME` edges for a node (captured via eBPF trace execution).
+
+* **Params:**
+  * `node_id` (string): Target node ID.
+  * `commit_sha` (string): Specific commit hash.
+* **Result:** Arrays of `runtime_callees` and `static_only_callees`.
+


The methods `smp/linker/report` and `smp/linker/runtime` are documented here but are not registered in the `RpcDispatcher` (see `smp/protocol/dispatcher.py`). Conversely, several registered methods such as `smp/reindex`, `smp/session/recover`, `smp/lock`, and `smp/unlock` are missing from this API reference. Please ensure the documentation accurately reflects the implemented and registered JSON-RPC methods.
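
Drift of this kind can be caught automatically by comparing the method names in the API reference against the registered ones. The following is a minimal sketch under stated assumptions: `_REGISTRY` and this `rpc_method` are hypothetical stand-ins for whatever `smp/protocol/dispatcher.py` actually maintains, and the heading pattern assumes every method is documented under a heading of the form ``### `smp/...` ``.

```python
import re
from collections.abc import Callable

# Hypothetical stand-in for the registry behind RpcDispatcher.
_REGISTRY: dict[str, Callable[..., object]] = {}


def rpc_method(name: str) -> Callable[[Callable[..., object]], Callable[..., object]]:
    """Register a handler under a JSON-RPC method name."""
    def decorator(func: Callable[..., object]) -> Callable[..., object]:
        _REGISTRY[name] = func
        return func
    return decorator


@rpc_method("smp/reindex")
def handle_reindex(params: dict) -> dict:
    return {}


@rpc_method("smp/lock")
def handle_lock(params: dict) -> dict:
    return {}


# Matches API-reference headings of the form: ### `smp/...`
HEADING = re.compile(r"^###\s+`(smp/[\w/]+)`", re.MULTILINE)


def find_drift(markdown: str, registered: set[str]) -> tuple[list[str], list[str]]:
    """Return (documented but unregistered, registered but undocumented)."""
    documented = set(HEADING.findall(markdown))
    return sorted(documented - registered), sorted(registered - documented)


# Illustrative excerpt; a real check would read the API reference file.
api_reference = "### `smp/linker/report`\ntext\n\n### `smp/linker/runtime`\ntext\n"
unregistered, undocumented = find_drift(api_reference, set(_REGISTRY))
```

Run in CI, a check like this would fail whenever either list is non-empty, which covers both directions of the mismatch described above.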

gemini-code-assist · 2026-04-19T13:44:00Z

+│   │       ├── community.py     # smp/community/detect, list, get, boundaries
+│   │       ├── query.py         # smp/navigate, trace, context, impact, locate, flow, diff, why
+│   │       ├── enrichment.py    # smp/enrich, annotate, tag, search
+│   │       ├── safety.py        # smp/session/*, guard/check, dryrun, checkpoint, lock, audit
+│   │       ├── planning.py      # smp/plan, conflict
+│   │       ├── sandbox.py       # smp/sandbox/spawn, execute, destroy
+│   │       ├── verify.py        # smp/verify/integrity
+│   │       ├── handoff.py       # smp/handoff/review, pr
+│   │       └── telemetry.py     # smp/telemetry/*


The project structure described here is inaccurate compared to the actual file organization and handler registration in `smp/protocol/dispatcher.py`. For example, `smp/plan` and `smp/conflict` are in `query_ext.py`, not `planning.py` (which does not exist), and `smp/annotate` is in `annotation.py`, not `enrichment.py`.

Suggested change

-│ │ ├── community.py # smp/community/detect, list, get, boundaries
-│ │ ├── query.py # smp/navigate, trace, context, impact, locate, flow, diff, why
-│ │ ├── enrichment.py # smp/enrich, annotate, tag, search
-│ │ ├── safety.py # smp/session/*, guard/check, dryrun, checkpoint, lock, audit
-│ │ ├── planning.py # smp/plan, conflict
-│ │ ├── sandbox.py # smp/sandbox/spawn, execute, destroy
-│ │ ├── verify.py # smp/verify/integrity
-│ │ ├── handoff.py # smp/handoff/review, pr
-│ │ └── telemetry.py # smp/telemetry/*
+│ ├── memory.py # smp/update, batch_update, reindex
+│ ├── merkle.py # smp/sync, merkle/tree, index/*
+│ ├── query.py # smp/navigate, trace, context, impact, locate, search, flow
+│ ├── query_ext.py # smp/diff, plan, conflict, why
+│ ├── enrichment.py # smp/enrich/*
+│ ├── annotation.py # smp/annotate/*, tag
+│ ├── safety.py # smp/session/*, guard/check, dryrun, checkpoint, rollback, lock, unlock, audit, verify/integrity
+│ ├── sandbox.py # smp/sandbox/*
+│ ├── handoff.py # smp/handoff/*
+│ └── telemetry.py # smp/telemetry/*
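
The module-to-method mapping in the suggested change can also be expressed as data, which makes it possible to check that each method belongs to exactly one handler module. The sketch below covers only a subset of that layout, with the shorthand expanded to full method names; `HANDLER_MODULES` and `module_for` are hypothetical names, and the `*` entries are treated as glob patterns.

```python
from fnmatch import fnmatchcase

# Subset of the suggested layout, with shorthand expanded to full method names.
HANDLER_MODULES: dict[str, list[str]] = {
    "query.py": ["smp/navigate", "smp/trace", "smp/context", "smp/impact",
                 "smp/locate", "smp/search", "smp/flow"],
    "query_ext.py": ["smp/diff", "smp/plan", "smp/conflict", "smp/why"],
    "enrichment.py": ["smp/enrich/*"],
    "annotation.py": ["smp/annotate/*", "smp/tag"],
}


def module_for(method: str) -> str:
    """Return the single handler module whose patterns match the method."""
    matches = [
        module
        for module, patterns in HANDLER_MODULES.items()
        if any(fnmatchcase(method, pattern) for pattern in patterns)
    ]
    if len(matches) != 1:
        raise LookupError(f"{method!r} maps to {len(matches)} modules")
    return matches[0]


plan_module = module_for("smp/plan")
```

A lookup that raises `LookupError` signals either a method missing from the tree or one claimed by two modules, which are the kinds of inaccuracies this comment points out.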

doc work 2

281e061

offx-zinth merged commit ec92d4e into main Apr 19, 2026
1 check failed

gemini-code-assist Bot reviewed Apr 19, 2026

offx-zinth merged 1 commit into main from master
