[Feature]: Add knowledge Graphs for Code & relational data #717

@genaray

Description

Problem Statement

We can use semantic search to find parts of the code or parts of documents, but we are missing an understanding of the relations between them. For example: "What breaks when we remove xy from the code?"

Proposed Solution

Proposed Functionality:

Codebase Analysis:

  • Parse source code (e.g., functions, classes, modules) and generate a knowledge graph representing relationships such as calls, inheritance, and module dependencies.
  • Support multiple programming languages.
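
As a rough illustration of the parsing step, the sketch below uses Python's standard `ast` module to pull "calls" and "inherits" edges out of a small source string. The function name `build_code_graph`, the edge format, and the sample source are all assumptions made for this example, not part of any existing API. Name resolution is deliberately naive: a method call such as `obj.run()` is recorded only by its attribute name `run`.

```python
import ast
from collections import defaultdict

# Hypothetical sample input; any Python source string would work.
SOURCE = '''
class Base:
    def run(self):
        helper()

class Child(Base):
    pass

def helper():
    return 1

def main():
    helper()
    Child().run()
'''


def build_code_graph(source: str) -> dict[str, set[tuple[str, str]]]:
    """Return edges as {source_node: {(relation, target_node), ...}}."""
    tree = ast.parse(source)
    edges: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def record_calls(owner: str, node: ast.AST) -> None:
        # Walk the whole body of `owner` and record every call expression.
        for child in ast.walk(node):
            if isinstance(child, ast.Call):
                func = child.func
                if isinstance(func, ast.Name):
                    edges[owner].add(("calls", func.id))
                elif isinstance(func, ast.Attribute):
                    # Naive: only the attribute name is kept, not the receiver.
                    edges[owner].add(("calls", func.attr))

    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            record_calls(node.name, node)
        elif isinstance(node, ast.ClassDef):
            for base in node.bases:
                if isinstance(base, ast.Name):
                    edges[node.name].add(("inherits", base.id))
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    record_calls(f"{node.name}.{item.name}", item)
    return dict(edges)


graph = build_code_graph(SOURCE)
for owner in sorted(graph):
    for relation, target in sorted(graph[owner]):
        print(f"{owner} --{relation}--> {target}")
```

Supporting multiple languages would need a parser per language (or a shared parsing library) behind the same edge format; this sketch covers Python only.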

Relational/Structured Documents:

  • Automatically extract entities and their relations from documents like CSVs, JSON, YAML, or database schemas.
  • Represent these relationships in a graph that can be queried semantically.
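
For structured inputs, one possible starting point is to treat foreign keys in a database schema as graph edges. The sketch below assumes a hypothetical schema dictionary (the `SCHEMA` layout, the `Edge` tuple, and `schema_to_edges` are invented for illustration) and turns each foreign key into a "references" relation between two table entities.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: str    # table that holds the foreign key
    relation: str  # kind of relationship
    target: str    # table being referenced
    via: str       # column that carries the reference


# Hypothetical schema description; a real one could be read from a database.
SCHEMA = {
    "customers": {"columns": ["id", "name"], "foreign_keys": {}},
    "products": {"columns": ["id", "title"], "foreign_keys": {}},
    "orders": {
        "columns": ["id", "customer_id"],
        "foreign_keys": {"customer_id": "customers.id"},
    },
    "order_items": {
        "columns": ["id", "order_id", "product_id"],
        "foreign_keys": {"order_id": "orders.id", "product_id": "products.id"},
    },
}


def schema_to_edges(schema: dict) -> list[Edge]:
    """Turn every foreign key into a 'references' edge between tables."""
    edges = []
    for table, spec in schema.items():
        for column, target in spec["foreign_keys"].items():
            target_table, _target_column = target.split(".")
            edges.append(Edge(table, "references", target_table, column))
    return edges


schema_edges = schema_to_edges(SCHEMA)
for edge in schema_edges:
    print(f"{edge.source} --{edge.relation} ({edge.via})--> {edge.target}")
```

CSV, JSON, or YAML inputs carry no explicit foreign keys, so relations there would have to be inferred (for example from matching column names), which is a harder problem than this sketch addresses.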

Integration with Vector Embeddings:

  • Map nodes and edges to vector embeddings for enhanced search and reasoning.
  • Allow users to perform queries like “find all functions called by X” or “show related entities to Y in dataset Z.”
  • Support graph-based querying in addition to vector similarity searches.
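
A query such as "find all functions called by X" could combine both mechanisms: vector similarity picks the node that best matches the query, and a graph lookup then follows its edges. The sketch below uses hand-written toy embeddings and a toy call graph; `EMBEDDINGS`, `CALLS`, and all function names are assumptions for illustration, and a real system would obtain the vectors from an embedding model.

```python
import math

# Toy node embeddings (hypothetical values, normally produced by a model).
EMBEDDINGS = {
    "parse_config": [0.9, 0.1, 0.0],
    "load_file": [0.7, 0.3, 0.1],
    "render_page": [0.0, 0.2, 0.9],
}

# Toy call graph: function -> functions it calls.
CALLS = {
    "parse_config": ["load_file"],
    "load_file": [],
    "render_page": ["parse_config"],
}


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_node(query_vector: list[float]) -> str:
    """Vector step: pick the node whose embedding is closest to the query."""
    return max(EMBEDDINGS, key=lambda name: cosine(query_vector, EMBEDDINGS[name]))


def functions_called_by(name: str) -> list[str]:
    """Graph step: follow outgoing 'calls' edges from the given node."""
    return list(CALLS.get(name, []))


def hybrid_query(query_vector: list[float]) -> tuple[str, list[str]]:
    entry = nearest_node(query_vector)
    return entry, functions_called_by(entry)


entry, callees = hybrid_query([1.0, 0.0, 0.0])
print(entry, callees)
```

Here the vector step only selects the entry point; embedding edges as well, as the proposal suggests, would be an extension beyond this sketch.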

Benefits:

  • Improved understanding of large codebases or datasets.
  • Easier impact analysis and dependency tracking.
  • Enables AI/ML applications by combining semantic search with structured knowledge representations.
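
Impact analysis, i.e. the "what breaks when we remove xy?" question from the problem statement, can be read as reverse reachability on a dependency graph. The sketch below assumes a hypothetical `DEPENDS_ON` mapping and an `impacted_by` helper (both invented for this example): it inverts the edges and walks breadth-first from the removed node to collect everything that depends on it, directly or transitively.

```python
from collections import deque

# Hypothetical dependency graph: module -> modules it depends on.
DEPENDS_ON = {
    "api": ["service"],
    "cli": ["service"],
    "service": ["repository", "cache"],
    "repository": ["db_driver"],
    "cache": [],
    "db_driver": [],
}


def impacted_by(removed: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return every node that directly or transitively depends on `removed`."""
    # Invert the edges: target -> set of nodes that depend on it.
    dependents: dict[str, set[str]] = {}
    for node, targets in depends_on.items():
        for target in targets:
            dependents.setdefault(target, set()).add(node)

    seen: set[str] = set()
    queue = deque([removed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    seen.discard(removed)  # only relevant if the graph contains a cycle
    return seen


print(sorted(impacted_by("db_driver", DEPENDS_ON)))
```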

Optional Enhancements:

  • Incremental updates of the knowledge graph as the underlying data changes.
  • Integration with existing graph query languages like Cypher or GraphQL.
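
Incremental updates could be driven by content fingerprints, so that only changed inputs are re-parsed. The sketch below is one possible approach, not a description of how the project works: `fingerprint` and `plan_update` are hypothetical helpers that compare SHA-256 hashes from the previous run with the current file contents and report which files need re-parsing and which nodes should be dropped.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Stable content hash used to detect changes between runs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_update(
    previous: dict[str, str], files: dict[str, str]
) -> tuple[list[str], list[str], dict[str, str]]:
    """Compare stored hashes with current contents.

    Returns (paths to re-parse, paths whose nodes should be removed,
    new hash table to store for the next run).
    """
    current = {path: fingerprint(text) for path, text in files.items()}
    changed = sorted(p for p, h in current.items() if previous.get(p) != h)
    removed = sorted(p for p in previous if p not in current)
    return changed, removed, current


# First run: nothing stored yet, so every file is parsed.
files_v1 = {"a.py": "def a(): pass", "b.py": "def b(): pass"}
changed_v1, removed_v1, hashes_v1 = plan_update({}, files_v1)

# Second run: a.py was edited, b.py was deleted, c.py is new.
files_v2 = {"a.py": "def a(): return 1", "c.py": "def c(): pass"}
changed_v2, removed_v2, hashes_v2 = plan_update(hashes_v1, files_v2)
print(changed_v2, removed_v2)
```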

In addition, to my knowledge this would be the first project to combine automatic vectorisation with knowledge graphs of the data.

Alternatives Considered

No response

Feature Area

Core (Client/Engine)

Use Case

It would be highly valuable to extend the capabilities of this vector database with automatic knowledge graph generation from codebases and other relational or structured documents. This would allow users to not only perform semantic searches but also understand relationships, dependencies, and hierarchies between entities in their data.

Example API (Optional)

Additional Context

No response

Contribution

  • I am willing to contribute to implementing this feature

Metadata

Assignees

No one assigned

Labels

enhancement: New feature or request

Type

No type

Projects

Status

Backlog

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests