fix: MCP tools query nonexistent KuzuDB relationship tables #58

Open
zm2231 wants to merge 4 commits into harshkedia177:main from zm2231:fix/mcp-kuzu-rel-table-names
Conversation

zm2231 commented Mar 18, 2026

Rebased PR #58 onto the latest upstream main and extended it to support the configured remote embedding provider end-to-end.

This now:

  • uses the configured HTTP embedding provider for Axon
  • records the effective embedding model and dimensions in .axon/meta.json
  • updates Kuzu embedding schema and vector casts to match configured dimensions
  • re-embeds when model or dimensions change
  • refreshes embedding config correctly in long-lived processes
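The re-embed decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the key names `embedding_model` and `embedding_dimensions` and the helper names are assumptions, and the real layout of .axon/meta.json may differ.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_meta(meta_path: Path) -> Dict[str, Any]:
    """Read the metadata file, treating a missing or unreadable file as empty."""
    try:
        data = json.loads(meta_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


def needs_reembed(meta_path: Path, model: str, dimensions: int) -> bool:
    """True when the recorded model or dimensions differ from the configured ones."""
    meta = load_meta(meta_path)
    # Hypothetical key names; a missing key also counts as a mismatch.
    recorded = (meta.get("embedding_model"), meta.get("embedding_dimensions"))
    return recorded != (model, dimensions)


def record_embedding_config(meta_path: Path, model: str, dimensions: int) -> None:
    """Store the effective embedding model and dimensions, keeping other keys."""
    meta = load_meta(meta_path)
    meta.update({"embedding_model": model, "embedding_dimensions": dimensions})
    meta_path.parent.mkdir(parents=True, exist_ok=True)
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf-8")
```

Calling `needs_reembed` before each index run and `record_embedding_config` after a successful embed keeps the stored vectors and the recorded configuration in step.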

Verified with:

  • Axon tests passing locally
  • kg2 reindexed successfully with BAAI/bge-large-en-v1.5 at 1024 dimensions
  • pi-ult reindexed successfully with the same provider/model/dimensions

Operational note:

  • long reindexes should be launched as managed background jobs
  • query/tool usage still goes through the Axon MCP/tool path

Author

zm2231 commented Mar 21, 2026

Addressed the remote embedding parity issue and tightened the HTTP backend contract.

What was wrong

The remote backend returns plain Python lists, but the incremental embed path still assumed ndarray-like vectors and called vector.tolist(). That caused the repeated Incremental embedding failed errors in watch mode.
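A minimal sketch of the failure mode and one way around it, assuming a hypothetical helper name `as_float_list`: a plain list has no `tolist()` method, so the conversion has to check for it rather than call it unconditionally.

```python
from typing import Any, List


def as_float_list(vector: Any) -> List[float]:
    """Convert an ndarray-like object or a plain sequence into a list of floats."""
    # ndarray-like vectors expose tolist(); plain Python lists do not,
    # so calling vector.tolist() directly raises AttributeError on them.
    if hasattr(vector, "tolist"):
        vector = vector.tolist()
    return [float(x) for x in vector]
```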

What changed

  1. Normalize both local + remote embeddings through the same validation path
  2. Enforce that the embedding count matches the input batch size
  3. Sort/validate HTTP response index values before use
  4. Reject non-numeric / non-finite embedding values
  5. Enforce consistent embedding dimensions, including across HTTP batches
  6. Add regression tests covering local + HTTP mismatch cases
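The checks in items 1 to 5 could look roughly like the sketch below. It is an illustration under assumptions, not the PR's code: the function names are hypothetical, and the HTTP response is assumed to be a list of items carrying an integer `index` and an `embedding` list.

```python
import math
from numbers import Real
from typing import Any, Dict, List, Optional, Sequence


def order_http_items(items: Sequence[Dict[str, Any]], batch_size: int) -> List[Any]:
    """Return embeddings sorted by index, requiring indices 0..batch_size-1."""
    indices = [item.get("index") for item in items]
    if any(isinstance(i, bool) or not isinstance(i, int) for i in indices):
        raise ValueError("HTTP response contains a non-integer index")
    if sorted(indices) != list(range(batch_size)):
        raise ValueError("HTTP response indices do not match the input batch")
    ordered = sorted(items, key=lambda item: item["index"])
    return [item["embedding"] for item in ordered]


def validate_embeddings(
    vectors: Sequence[Any],
    expected_count: int,
    expected_dim: Optional[int] = None,
) -> List[List[float]]:
    """Normalize vectors to lists of finite floats with one shared dimension."""
    if len(vectors) != expected_count:
        raise ValueError(f"expected {expected_count} embeddings, got {len(vectors)}")
    result: List[List[float]] = []
    for vector in vectors:
        if hasattr(vector, "tolist"):  # ndarray-like input from a local backend
            vector = vector.tolist()
        row: List[float] = []
        for value in vector:
            if isinstance(value, bool) or not isinstance(value, Real):
                raise ValueError("embedding contains a non-numeric value")
            number = float(value)
            if not math.isfinite(number):
                raise ValueError("embedding contains a non-finite value")
            row.append(number)
        if not row:
            raise ValueError("embedding is empty")
        if expected_dim is None:
            expected_dim = len(row)  # first vector fixes the dimension
        elif len(row) != expected_dim:
            raise ValueError(f"expected dimension {expected_dim}, got {len(row)}")
        result.append(row)
    return result
```

Passing the dimension of the first batch as `expected_dim` for later batches is one way to enforce consistency across HTTP batches.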

Verified

  • PYTHONPATH=src pytest tests/core/test_embedder.py -q → 36 passed
  • PYTHONPATH=src pytest tests/core/test_watcher.py -q → 21 passed

This should eliminate the repeated watcher failures caused by the remote provider returning plain lists, and makes the HTTP path behave much more like the local fastembed contract.

Also debugged the restart issue locally: the immediate crash after killing the runaway host was a bad Kuzu WAL, not the base DB file. Removing the WAL let the host start cleanly again.
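For anyone hitting the same restart crash, a cautious variant of that workaround is to move the WAL aside instead of deleting it. The sketch below takes the WAL path as an explicit argument because the file's name and location depend on the Kuzu version and the project layout; the function name is hypothetical. It should only be run while the host process is stopped, and any writes that exist only in the WAL are discarded.

```python
import time
from pathlib import Path
from typing import Optional


def quarantine_wal(wal_path: Path) -> Optional[Path]:
    """Rename a suspect WAL file so the database can open without it.

    Returns the backup path, or None when there is no WAL file to move.
    """
    if not wal_path.is_file():
        return None
    # Keep the old file next to the original for later inspection.
    backup = wal_path.with_name(f"{wal_path.name}.bad-{int(time.time())}")
    wal_path.rename(backup)
    return backup
```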

zm2231 added 4 commits March 27, 2026 17:42
The MCP tools in tools.py query COUPLED_WITH, MEMBER_OF, and
STEP_IN_PROCESS as standalone KuzuDB relationship tables, but
kuzu_backend.py stores all relationships in a single CodeRelation
table group with rel_type as a string property.

When a repo has few/no edges of a given type, KuzuDB never creates
the table, causing 'Table X does not exist' errors in:
- axon_file_context
- axon_coupling
- axon_communities
- axon_explain

Fix: replace the relationship table names with CodeRelation and
filter by rel_type property where needed. For MEMBER_OF and
STEP_IN_PROCESS, the target node type (Community/Process) already
disambiguates without an explicit rel_type filter.
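
To illustrate the shape of the change, here is a small sketch of how such queries might be built. It is not the code in tools.py: the helper name, the returned `a.id` and `b.id` properties, and the `rel_type` value `coupled_with` are assumptions made for the example; only the `CodeRelation` table, the `rel_type` property, and the Community/Process node types come from the description above.

```python
from typing import Any, Dict, Optional, Tuple

# Old style (illustrative): fails with "Table COUPLED_WITH does not exist"
# because no standalone relationship table of that name was ever created.
BROKEN_QUERY = "MATCH (a)-[r:COUPLED_WITH]->(b) RETURN a.id, b.id"


def code_relation_query(
    rel_type: Optional[str] = None,
    target_label: Optional[str] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Build a Cypher query against the single CodeRelation table group.

    rel_type adds a property filter; target_label (e.g. Community or Process)
    narrows the match by node type, which can make the filter unnecessary.
    """
    target = f"(b:{target_label})" if target_label else "(b)"
    params: Dict[str, Any] = {}
    where = ""
    if rel_type is not None:
        where = " WHERE r.rel_type = $rel_type"
        params["rel_type"] = rel_type
    query = f"MATCH (a)-[r:CodeRelation]->{target}{where} RETURN a.id, b.id"
    return query, params
```

For example, `code_relation_query("coupled_with")` yields a filtered query plus its parameter, while `code_relation_query(target_label="Community")` relies on the target node type alone.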
zm2231 force-pushed the fix/mcp-kuzu-rel-table-names branch from d9aa6d1 to 75f19c8 on March 27, 2026 23:54
Author

zm2231 commented Mar 28, 2026

Updated this PR branch with the rebased/fixed version. It now includes the original MCP relationship-table fix plus follow-up commits for end-to-end remote embedding support, provider-native dimensions, and environment refresh in long-lived processes. Please review the refreshed PR body for the current behavior and verification details.
