From 78561777fbb17952fee8cb5cbdab95b600a0df53 Mon Sep 17 00:00:00 2001
From: Tirso Garcia
Date: Wed, 18 Mar 2026 23:26:40 +0100
Subject: [PATCH] Add graph explorer planning docs

---
 docs/BUG_DEPTH_TRAVERSAL.md         | 201 +++++++++++++++++++++++
 docs/PLAN_GRAPH_EXPLORER.md         | 244 ++++++++++++++++++++++++++++
 docs/REQUIREMENTS_GRAPH_EXPLORER.md |  90 ++++++++++
 3 files changed, 535 insertions(+)
 create mode 100644 docs/BUG_DEPTH_TRAVERSAL.md
 create mode 100644 docs/PLAN_GRAPH_EXPLORER.md
 create mode 100644 docs/REQUIREMENTS_GRAPH_EXPLORER.md

diff --git a/docs/BUG_DEPTH_TRAVERSAL.md b/docs/BUG_DEPTH_TRAVERSAL.md
new file mode 100644
index 0000000..df60a02
--- /dev/null
+++ b/docs/BUG_DEPTH_TRAVERSAL.md
@@ -0,0 +1,201 @@
+# Bug: Graph Traversal Limited to Depth 1
+
+**Severity:** HIGH — blocks interactive graph exploration and demo rehydration phase
+**Reported:** 2026-03-18
+**Affects:** GetContext, GetGraphRelationships, RehydrateSession
+
+## Summary
+
+The kernel returns only 1 level of neighbors for any graph query, regardless of graph depth or client request. The `depth` parameter in `GetGraphRelationships` is accepted but silently ignored. This makes it impossible to:
+
+1. Render a full task graph (e.g., root → subtasks → sub-subtasks)
+2. Build an interactive graph explorer that drills into nodes
+3. Show node details from Valkey when visiting a specific node
+
+## Reproduction
+
+```bash
+# Neo4j has a 3-level graph:
+#   root → task-A → subtask-1
+#                 → subtask-2
+#        → task-B → subtask-3
+
+grpcurl -plaintext -d '{
+  "root_node_id": "node:mission:engine-core-failure",
+  "role": "implementer",
+  "token_budget": 4000
+}' localhost:50054 \
+  underpass.rehydration.kernel.v1alpha1.ContextQueryService/GetContext
+
+# Expected: 8 nodes (root + 7 descendants)
+# Actual:   3 nodes (root + 2 direct children only)
+```
+
+## Root Cause
+
+Three layers enforce the 1-level limit:
+
+### 1. Domain Port — no depth parameter
+
+**File:** `crates/rehydration-domain/src/repositories/graph_neighborhood_reader.rs`
+
+```rust
+pub trait GraphNeighborhoodReader {
+    fn load_neighborhood(
+        &self,
+        root_node_id: &str,
+        // ← missing: depth: u32
+    ) -> impl Future<Output = Result<Option<NodeNeighborhood>, PortError>> + Send;
+}
+```
+
+The port contract has no way to request deeper traversal.
+
+### 2. Neo4j Cypher — hardcoded 1-hop
+
+**File:** `crates/rehydration-adapter-neo4j/src/adapter/queries/load_neighborhood_query.rs`
+
+```cypher
+MATCH (root:ProjectionNode {node_id: $root_node_id})
+OPTIONAL MATCH (root)-[:RELATED_TO]-(seed_neighbor:ProjectionNode)
+//             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ fixed 1-hop pattern
+```
+
+The Cypher matches a single relationship hop. It has no variable-length path.
+
+### 3. Application — depth parameter is dead code
+
+**File:** `crates/rehydration-application/src/queries/graph_relationships.rs`
+
+```rust
+pub struct GetGraphRelationshipsQuery {
+    pub node_id: String,
+    pub node_kind: Option<String>,
+    pub depth: u32,                    // ← ACCEPTED
+    pub include_reverse_edges: bool,
+}
+
+pub async fn execute(&self, query: GetGraphRelationshipsQuery) -> Result<...> {
+    let neighborhood = load_existing_neighborhood(
+        &self.graph_reader,
+        &query.node_id,                // ← depth NOT passed
+    ).await?;
+}
+```
+
+The gRPC transport clamps `depth` to `[1, 3]`, but the value never reaches the Cypher query.
+
+## Call Chain
+
+```
+gRPC GetContext / GetGraphRelationships
+  ↓
+Transport: clamp_depth(request.depth) → depth=3
+  ↓
+Application: GetGraphRelationshipsQuery { depth: 3, ... }
+  ↓
+Application::execute() → load_existing_neighborhood(graph_reader, node_id)
+  ↓                                        ↑ depth NOT forwarded
+GraphNeighborhoodReader::load_neighborhood(root_node_id)  ← no depth param
+  ↓
+Neo4j Cypher: (root)-[:RELATED_TO]-(neighbor)  ← hardcoded 1-hop
+  ↓
+NodeNeighborhood { root, neighbors: [direct only], relations: [1-level only] }
+```
+
+## Required Fix
+
+### Option A: Variable-depth Cypher (recommended)
+
+Changes in 3 files:
+
+**1. Domain port** — add `depth` parameter:
+
+```rust
+fn load_neighborhood(
+    &self,
+    root_node_id: &str,
+    depth: u32,
+) -> impl Future<Output = Result<Option<NodeNeighborhood>, PortError>> + Send;
+```
+
+**2. Neo4j Cypher** — use variable-length path:
+
+```cypher
+MATCH (root:ProjectionNode {node_id: $root_node_id})
+OPTIONAL MATCH path = (root)-[:RELATED_TO*1..3]-(neighbor:ProjectionNode)
+WITH root, collect(DISTINCT neighbor) AS neighbors, collect(relationships(path)) AS all_rels
+// ... flatten and dedup relationships
+```
+
+The `*1..3` pattern traverses 1 to 3 hops. The literal `3` is only illustrative: the upper bound should come from the clamped depth parameter.
+
+**3. Application** — forward depth:
+
+```rust
+let neighborhood = self.graph_reader
+    .load_neighborhood(&query.node_id, query.depth)
+    .await?;
+```
+
+### Option B: Iterative client-side traversal (workaround, not recommended)
+
+The client requests the neighborhood of each node in turn and reconstructs the tree. This works, but:
+- N+1 query problem (1 gRPC call per node)
+- Latency grows linearly with graph size
+- Client takes on graph assembly responsibility that belongs to the kernel
+
+## Additional Requirement: Node Detail Lookup
+
+For an interactive graph explorer, the client needs to visit a node and see its full detail stored in Valkey. Today there is no standalone RPC for this.
+
+**Required:** A `GetNodeDetail(node_id)` RPC that reads from the Valkey detail store and returns the node's full content (description, properties, history).
+
+This could be a new RPC on `ContextQueryService`:
+
+```protobuf
+rpc GetNodeDetail(GetNodeDetailRequest) returns (GetNodeDetailResponse);
+
+message GetNodeDetailRequest {
+  string node_id = 1;
+}
+
+message GetNodeDetailResponse {
+  string node_id = 1;
+  string title = 2;
+  string detail = 3;
+  string content_hash = 4;
+  uint64 revision = 5;
+  map<string, string> properties = 6;
+}
+```
+
+## Files to Modify
+
+| File | Change |
+|------|--------|
+| `crates/rehydration-domain/src/repositories/graph_neighborhood_reader.rs` | Add `depth: u32` to trait |
+| `crates/rehydration-adapter-neo4j/src/adapter/queries/load_neighborhood_query.rs` | Variable-length Cypher |
+| `crates/rehydration-adapter-neo4j/src/adapter/load_neighborhood.rs` | Pass depth to query |
+| `crates/rehydration-application/src/queries/graph_relationships.rs` | Forward `depth` to port |
+| `crates/rehydration-application/src/queries/get_context.rs` | Forward depth (default 3) |
+| `crates/rehydration-transport-grpc/src/transport/*/get_context.rs` | Map depth from request |
+| All callers of `load_neighborhood` | Add depth argument |
+| Proto `query.proto` | Add `GetNodeDetail` RPC |
+
+## Test Data in Neo4j (already seeded)
+
+The `node:mission:engine-core-failure` graph has 3 levels below the root:
+
+```
+node:mission:engine-core-failure (root, mission, AT_RISK)
+  ├── node:task:diagnose-anomaly (task, done)
+  ├── node:task:assess-cascade (task, done)
+  │   ├── node:task:direct-engine-repair (task, abandoned)
+  │   └── node:task:hull-first-protocol (task, active)
+  │       ├── node:task:seal-hull (task, active)
+  │       ├── node:task:stabilize-power (task, pending)
+  │       └── node:task:repair-engine-safe (task, pending)
+```
+
+Querying with depth=1 returns 3 nodes. Depth=3 should return all 8.
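The expected node counts for the seeded graph can be checked without Neo4j. The following is a minimal sketch in plain Rust, not kernel code: `EDGES` mirrors the tree above, `nodes_within_depth` is a hypothetical helper, and edges are walked in both directions on the assumption that this matches the undirected `-[:RELATED_TO]-` pattern. The `seen` set also shows the deduplication a multi-hop query needs once cycles or cross-links exist.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

const ROOT: &str = "node:mission:engine-core-failure";

/// Edges of the seeded test graph listed above, as (parent, child) pairs.
const EDGES: [(&str, &str); 7] = [
    (ROOT, "node:task:diagnose-anomaly"),
    (ROOT, "node:task:assess-cascade"),
    ("node:task:assess-cascade", "node:task:direct-engine-repair"),
    ("node:task:assess-cascade", "node:task:hull-first-protocol"),
    ("node:task:hull-first-protocol", "node:task:seal-hull"),
    ("node:task:hull-first-protocol", "node:task:stabilize-power"),
    ("node:task:hull-first-protocol", "node:task:repair-engine-safe"),
];

/// Hypothetical helper (not part of the kernel): ids of all nodes reachable
/// from `root` in at most `depth` hops, including `root` itself.
fn nodes_within_depth(root: &str, depth: u32) -> HashSet<String> {
    // Build an undirected adjacency list from the edge table.
    let mut adjacency: HashMap<&str, Vec<&str>> = HashMap::new();
    for (parent, child) in EDGES {
        adjacency.entry(parent).or_default().push(child);
        adjacency.entry(child).or_default().push(parent);
    }

    // Breadth-first search; `seen` prevents revisiting nodes.
    let mut seen: HashSet<String> = HashSet::from([root.to_string()]);
    let mut queue: VecDeque<(&str, u32)> = VecDeque::from([(root, 0)]);
    while let Some((node, hops)) = queue.pop_front() {
        if hops == depth {
            continue; // depth budget exhausted on this branch
        }
        if let Some(neighbors) = adjacency.get(node) {
            for &next in neighbors {
                if seen.insert(next.to_string()) {
                    queue.push_back((next, hops + 1));
                }
            }
        }
    }
    seen
}

fn main() {
    for depth in 1..=3 {
        let count = nodes_within_depth(ROOT, depth).len();
        println!("depth={depth}: {count} nodes");
    }
}
```

With this edge table, depth 1 yields the 3 nodes the kernel returns today and depth 3 yields all 8. An integration test against the real adapter could assert the same two counts.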
diff --git a/docs/PLAN_GRAPH_EXPLORER.md b/docs/PLAN_GRAPH_EXPLORER.md
new file mode 100644
index 0000000..d23c81e
--- /dev/null
+++ b/docs/PLAN_GRAPH_EXPLORER.md
@@ -0,0 +1,244 @@
+# Plan: Graph Explorer
+
+**Status:** proposed
+**Priority:** P0
+**Date:** 2026-03-18
+**Related:**
+- [`docs/REQUIREMENTS_GRAPH_EXPLORER.md`](./REQUIREMENTS_GRAPH_EXPLORER.md)
+- [`docs/BUG_DEPTH_TRAVERSAL.md`](./BUG_DEPTH_TRAVERSAL.md)
+
+## Goal
+
+Deliver a kernel-owned read surface that supports:
+
+- graph traversal beyond 1 hop
+- rehydration from any node in the graph
+- node detail lookup for an interactive explorer
+- a demoable end-to-end explorer journey against a real seeded graph
+
+## Decisions
+
+### 1. Native kernel API can evolve; compatibility API stays frozen
+
+The explorer should consume the native kernel API, not the frozen
+`fleet.context.v1` compatibility shell.
+
+Implication:
+
+- `underpass.rehydration.kernel.v1alpha1` may add explorer-specific fields and
+  RPCs
+- `fleet.context.v1.GetGraphRelationships.depth` stays clamped to `1..3` until
+  there is an explicit compatibility decision to change that contract
+
+### 2. Replace the hardcoded `1..3` kernel limit with a server guardrail
+
+The requirement says "no artificial caps", but a literal unbounded traversal is
+not a safe server contract for cyclic or unexpectedly large graphs.
+
+Plan:
+
+- remove the hardcoded `1..3` limit from the kernel-owned path
+- introduce a kernel-native default depth, recommended `10`
+- introduce a kernel-native maximum traversal depth, recommended `25`
+- clamp only at the kernel-native guardrail, not at the old compatibility bound
+
+### 3. `GetNodeDetail` must compose stores
+
+The current Valkey detail projection only stores:
+
+- `node_id`
+- `detail`
+- `content_hash`
+- `revision`
+
+Title, kind, summary, status, labels, and properties live in Neo4j. If the
+explorer wants a full node panel, `GetNodeDetail` cannot be a pure Valkey read
+with the current model.
+
+Plan:
+
+- `GetNodeDetail` will compose Neo4j node projection + Valkey node detail
+- Valkey remains the source for expanded detail text and detail revision
+- Neo4j remains the source for title and properties unless the projection model
+  is expanded later
+
+## Phase 1: Traversal Foundation
+
+### Outcome
+
+The kernel can load multi-hop neighborhoods and all bundle-producing read paths
+stop being limited to 1 hop.
+
+### Scope
+
+- add `depth` to `GraphNeighborhoodReader`
+- propagate depth through:
+  - `GetGraphRelationships`
+  - `GetContext`
+  - `RehydrateSession`
+  - bundle snapshot reads
+  - diagnostics reads
+- update Neo4j traversal query to variable-depth paths
+- deduplicate relationships across multi-hop expansion
+
+### Files
+
+- `crates/rehydration-domain/src/repositories/graph_neighborhood_reader.rs`
+- `crates/rehydration-adapter-neo4j/src/adapter/load_neighborhood.rs`
+- `crates/rehydration-adapter-neo4j/src/adapter/queries/load_neighborhood_query.rs`
+- `crates/rehydration-application/src/queries/graph_relationships.rs`
+- `crates/rehydration-application/src/queries/node_centric_projection_reader.rs`
+- `crates/rehydration-application/src/queries/get_context.rs`
+- `crates/rehydration-application/src/queries/rehydrate_session.rs`
+- `crates/rehydration-application/src/queries/bundle_snapshot.rs`
+- `crates/rehydration-application/src/queries/rehydration_diagnostics.rs`
+
+### Tests
+
+- unit test: `GetGraphRelationships` forwards requested depth
+- unit test: bundle reads use the kernel default depth
+- Neo4j integration test: depth `1` vs `3` returns different neighborhood sizes
+- integration test: a 3-level seeded graph is fully returned at depth `3`
+
+## Phase 2: Native Explorer Read Contract
+
+### Outcome
+
+The native kernel query surface exposes depth explicitly for explorer-style
+reads, without changing the frozen compatibility shell.
+
+### Scope
+
+- add `depth` to native `GetContextRequest`
+- treat `depth=0` as "use kernel default depth"
+- keep compatibility `GetContext` unchanged
+- keep compatibility `GetGraphRelationships.depth` clamped to `1..3`
+- add shared transport helpers for native depth default and max guardrail
+
+### Files
+
+- `api/proto/underpass/rehydration/kernel/v1alpha1/query.proto`
+- `crates/rehydration-transport-grpc/src/transport/query_grpc_service.rs`
+- `crates/rehydration-transport-grpc/src/transport/tests.rs`
+- native request/response fixtures under `api/examples` if needed
+
+### Tests
+
+- gRPC transport test: `GetContext(depth=0)` uses kernel default depth
+- gRPC transport test: `GetContext(depth=N)` forwards the requested depth
+- regression test: compatibility `GetGraphRelationships` still clamps to `1..3`
+
+## Phase 3: `GetNodeDetail`
+
+### Outcome
+
+The explorer can open any node and retrieve a full node panel in one RPC.
+
+### Scope
+
+- add `GetNodeDetail` to native `ContextQueryService`
+- define a response shape that reflects current data ownership
+- read node projection metadata from Neo4j
+- read expanded detail text from Valkey
+- return `NOT_FOUND` when the node does not exist in either store
+
+### Proposed response
+
+```protobuf
+rpc GetNodeDetail(GetNodeDetailRequest) returns (GetNodeDetailResponse);
+
+message GetNodeDetailRequest {
+  string node_id = 1;
+}
+
+message GetNodeDetailResponse {
+  string node_id = 1;
+  string node_kind = 2;
+  string title = 3;
+  string summary = 4;
+  string status = 5;
+  repeated string labels = 6;
+  map<string, string> properties = 7;
+  string detail = 8;
+  string content_hash = 9;
+  uint64 revision = 10;
+}
+```
+
+### Files
+
+- `api/proto/underpass/rehydration/kernel/v1alpha1/query.proto`
+- `crates/rehydration-application/src/queries/*`
+- `crates/rehydration-transport-grpc/src/transport/query_grpc_service.rs`
+- new query adapter or application composition around Neo4j + Valkey readers
+
+### Tests
+
+- unit test: node exists in both stores
+- unit test: node exists in Neo4j but has no Valkey detail yet
+- unit test: missing node returns not found
+- transport test: `GetNodeDetail` maps success and not-found correctly
+
+## Phase 4: Explorer E2E and Demo
+
+### Outcome
+
+The repo proves the explorer journey against a real, multi-level graph.
+
+### Scope
+
+- add a deeper seed than the current starship path
+- prove:
+  - full graph load from root
+  - zoom into mid-level node
+  - leaf rehydration
+  - node detail lookup
+  - rendered context changes when root changes
+
+### Suggested scenario
+
+Use a graph with:
+
+- at least 4 levels
+- sibling branches
+- cross-branch dependency edges
+- detail content for root, mid-level, and leaf nodes
+
+The current starship incident graph is a good seed base, but it should be
+extended to include a deeper subtree specifically for explorer navigation.
+
+### Tests
+
+- container-backed integration test for multi-hop `GetContext`
+- container-backed integration test for `GetNodeDetail`
+- cluster demo script for:
+  - root load
+  - zoom to mid-level node
+  - open detail panel
+  - zoom to leaf node
+
+## Out Of Scope
+
+- changing the external `fleet.context.v1` contract in this slice
+- adding write-side explorer mutations
+- moving all node metadata into Valkey
+- UI implementation in this repo
+
+## Risks
+
+- variable-depth traversal can blow up quickly on dense graphs if the server
+  guardrail is too high
+- cycles and cross-links can multiply relationship rows if the query is not
+  deduplicated carefully
+- changing compatibility depth behavior in the same slice would create an
+  avoidable migration blast radius
+- a pure-Valkey `GetNodeDetail` is not realistic with the current projection
+  model
+
+## Exit Criteria
+
+- native `GetContext` supports explicit depth
+- bundle-producing reads no longer stop at 1 hop
+- native `GetNodeDetail` exists and is transport-tested
+- explorer scenario passes on a deep seeded graph
+- compatibility depth behavior remains unchanged unless separately approved
diff --git a/docs/REQUIREMENTS_GRAPH_EXPLORER.md b/docs/REQUIREMENTS_GRAPH_EXPLORER.md
new file mode 100644
index 0000000..8d90a3d
--- /dev/null
+++ b/docs/REQUIREMENTS_GRAPH_EXPLORER.md
@@ -0,0 +1,90 @@
+# Requirements: Deep Graph Traversal & Rehydration from Any Node
+
+**Priority:** P0 — blocks demo and interactive graph explorer
+**Date:** 2026-03-18
+**Related:**
+- [`docs/BUG_DEPTH_TRAVERSAL.md`](./BUG_DEPTH_TRAVERSAL.md)
+- [`docs/PLAN_GRAPH_EXPLORER.md`](./PLAN_GRAPH_EXPLORER.md)
+
+## Context
+
+We are building an interactive application that lets a user navigate the full task graph visually — drilling into any node at any depth, viewing its details from Valkey, and rehydrating context from that point. The current kernel limits all queries to depth 1, which makes this impossible.
+
+## Requirement 1: Unlimited Depth Traversal
+
+The kernel must traverse the graph to **any depth** the client requests. No artificial caps.
+
+- If the graph has 10 levels, depth=10 must return all 10 levels.
+- The current `clamp(1, 3)` in the transport layer must be removed.
+- The Cypher query must use variable-length paths (`*1..N`), not a hardcoded single-hop pattern.
+- The `GraphNeighborhoodReader` port must accept a `depth` parameter and the Neo4j adapter must honor it.
+- `GetContext` and `GetGraphRelationships` must both support depth.
+
+A practical default (e.g., depth=10) is fine if no depth is specified, but the client must be able to override it.
+
+## Requirement 2: Rehydrate from Any Node
+
+`GetContext` must accept **any node** as `root_node_id` — not just top-level mission or story nodes.
+
+- If I pass a subtask node ID, the kernel rehydrates from that subtask downward: its children, its relationships, its details, its context bundle.
+- If I pass a leaf node with no children, the kernel returns that single node with its full detail.
+- This enables an interactive explorer where the user clicks on any node and gets its rehydrated context — as if that node were the root of a smaller graph.
+- The rendered context (`RenderedContext`) should reflect the subgraph from that node, not the full tree from the original root.
+
+## Requirement 3: Node Detail Lookup from Valkey
+
+A standalone RPC to fetch the full detail of a single node from the Valkey detail store.
+
+When a user visits a node in the explorer, they need:
+- Title, description/detail content
+- All properties (as stored in `properties_json`)
+- Content hash and revision
+- Any history or timeline data associated with that node
+
+Proposed proto:
+
+```protobuf
+rpc GetNodeDetail(GetNodeDetailRequest) returns (GetNodeDetailResponse);
+
+message GetNodeDetailRequest {
+  string node_id = 1;
+}
+
+message GetNodeDetailResponse {
+  string node_id = 1;
+  string title = 2;
+  string detail = 3;
+  string content_hash = 4;
+  uint64 revision = 5;
+  map<string, string> properties = 6;
+}
+```
+
+This RPC reads directly from the Valkey detail store — it does not need to touch Neo4j.
+
+## Use Cases
+
+### Interactive Graph Explorer
+1. User opens the explorer → `GetContext(root, depth=MAX)` → renders full tree
+2. User clicks on a node → `GetNodeDetail(node_id)` → shows content panel from Valkey
+3. User wants to "zoom in" on a subtask → `GetContext(subtask_id, depth=MAX)` → re-renders tree from that point
+4. User navigates back up → `GetContext(parent_id, depth=MAX)` → re-renders from parent
+
+### Demo Phase 7 (Rehydration)
+1. TUI calls `GetContext("node:mission:engine-core-failure", role="implementer", depth=3)` → gets full 8-node task graph
+2. TUI renders the tree with real statuses (done, abandoned, active, pending)
+3. Ship's Log shows "KERNEL: GetContext: 8 nodes, 7 rels, N tokens" — all real
+
+### Agent Context Rehydration
+1. Agent is assigned to `node:task:seal-hull` → calls `GetContext("node:task:seal-hull")` → gets context for that specific task
+2. Context includes only what's relevant to sealing the hull — not the full mission tree
+3. This is the "surgical context" principle: 394 tokens, not 128,000
+
+## Acceptance Criteria
+
+- [ ] `GetContext` with a 3-level graph returns all levels (not just root + direct children)
+- [ ] `GetContext` with a leaf node as root returns that node with its detail
+- [ ] `GetContext` with a mid-level node returns that node + its subtree
+- [ ] `GetGraphRelationships` with `depth=N` returns N levels
+- [ ] `GetNodeDetail` returns the Valkey-stored detail for any valid node ID
+- [ ] No hardcoded depth caps — client controls traversal depth
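The last criterion and the server guardrail proposed in `docs/PLAN_GRAPH_EXPLORER.md` can coexist if the client chooses the depth and the kernel only supplies a default and an upper bound. The sketch below shows one possible shape for that resolution in Rust. `resolve_depth` and `neighborhood_query` are hypothetical names, the values 10 and 25 are the plan's recommendations rather than settled constants, and the `RETURN` clause is simplified. It also assumes the traversal bound must appear as a literal in the Cypher text rather than as a query parameter, which should be verified against the Neo4j version in use.

```rust
/// Assumed guardrail values, taken from the plan's recommendations.
const DEFAULT_DEPTH: u32 = 10;
const MAX_DEPTH: u32 = 25;

/// Hypothetical helper: `0` means "use the kernel default"; any other
/// value is honored up to the server-side maximum.
fn resolve_depth(requested: u32) -> u32 {
    if requested == 0 {
        DEFAULT_DEPTH
    } else {
        requested.min(MAX_DEPTH)
    }
}

/// Hypothetical helper: builds the variable-length Cypher text with the
/// resolved depth as a literal bound. Only an unsigned integer is
/// interpolated; the node id stays a bound parameter (`$root_node_id`).
fn neighborhood_query(requested_depth: u32) -> String {
    let depth = resolve_depth(requested_depth);
    format!(
        "MATCH (root:ProjectionNode {{node_id: $root_node_id}})\n\
         OPTIONAL MATCH path = (root)-[:RELATED_TO*1..{depth}]-(neighbor:ProjectionNode)\n\
         RETURN root, collect(DISTINCT neighbor) AS neighbors"
    )
}

fn main() {
    // A client asking for depth 3 gets exactly 3 hops.
    println!("{}\n", neighborhood_query(3));
    // A client that omits depth (0) gets the kernel default.
    println!("{}", neighborhood_query(0));
}
```

Because the interpolated value is a `u32` that has already been capped, the query text cannot be influenced by arbitrary client strings, and the compatibility `1..3` clamp can stay in its own transport path.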