Description
Problem Statement
Semantic search lets us find pieces of code or parts of documents, but it does not capture the relations between them. For example, it cannot answer: "What breaks when we remove xy from the code?"
Proposed Solution
Proposed Functionality:
Codebase Analysis:
- Parse source code (e.g., functions, classes, modules) and generate a knowledge graph representing relationships such as calls, inheritance, and module dependencies.
- Support multiple programming languages.
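To make the codebase analysis idea concrete, here is a minimal sketch for a single language (Python), using only the standard-library `ast` module. The names `build_graph`, `impacted_by`, and the sample `SOURCE` are hypothetical and exist only for illustration; a real implementation would need per-language parsers and proper name resolution (methods, imports, aliases), which this sketch ignores.

```python
import ast
from collections import defaultdict

# Hypothetical sample source, used only to demonstrate the idea.
SOURCE = '''
class Base:
    pass

class Child(Base):
    pass

def helper():
    return 1

def compute():
    return helper() + 1

def main():
    return compute()
'''

def build_graph(source: str):
    """Return (subject, relation, object) edges for calls and inheritance."""
    tree = ast.parse(source)
    edges = set()
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            # Only simple base names are handled (no dotted names).
            for base in node.bases:
                if isinstance(base, ast.Name):
                    edges.add((node.name, "inherits", base.id))
        elif isinstance(node, ast.FunctionDef):
            # Only direct calls to plain names are handled.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    edges.add((node.name, "calls", inner.func.id))
    return edges

def impacted_by(edges, target):
    """Return everything that transitively depends on `target`."""
    reverse = defaultdict(set)
    for src, _relation, dst in edges:
        reverse[dst].add(src)
    seen, stack = set(), [target]
    while stack:
        for dependent in reverse[stack.pop()]:
            if dependent not in seen:
                seen.add(dependent)
                stack.append(dependent)
    return seen

EDGES = build_graph(SOURCE)
print(sorted(impacted_by(EDGES, "helper")))
```

Walking the edges in reverse is what answers the "what breaks when we remove xy" question from the problem statement: removing `helper` affects `compute` directly and `main` transitively.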
Relational/Structured Documents:
- Automatically extract entities and their relations from documents like CSVs, JSON, YAML, or database schemas.
- Represent these relationships in a graph that can be queried semantically.
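For structured documents, one possible starting point is a naming heuristic. The sketch below assumes that any CSV column ending in `_id` (other than the row's own key) refers to an entity in another table. That convention, the function `csv_to_triples`, and the sample data are all assumptions made for illustration, not part of any existing API.

```python
import csv
import io

# Hypothetical sample data.
ORDERS_CSV = """order_id,customer_id,product_id
o1,c1,p1
o2,c1,p2
"""

def csv_to_triples(text, table, key):
    """Turn rows into (subject, relation, object) triples.

    Assumption: columns ending in "_id" reference entities in other tables.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"{table}:{row[key]}"
        for column, value in row.items():
            if column != key and column.endswith("_id") and value:
                target_table = column[: -len("_id")]
                triples.append(
                    (subject, f"has_{target_table}", f"{target_table}:{value}")
                )
    return triples

TRIPLES = csv_to_triples(ORDERS_CSV, "order", "order_id")
for triple in TRIPLES:
    print(triple)
```

Database schemas could skip the heuristic entirely and use declared foreign keys instead; JSON and YAML would need a similar rule for nested objects and references.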
Integration with Vector Embeddings:
- Map nodes and edges to vector embeddings for enhanced search and reasoning.
- Allow users to perform queries like “find all functions called by X” or “show related entities to Y in dataset Z.”
- Support graph-based querying in addition to vector similarity searches.
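A rough sketch of how the two query styles could be combined: vector similarity picks the entry node, then graph edges answer the relational part ("find all functions called by X"). Everything here is hypothetical, including `hybrid_query`, the node descriptions, and the bag-of-words `embed` function, which merely stands in for a real embedding model.

```python
import math
from collections import Counter

# Hypothetical graph: node descriptions plus "calls" edges.
NODES = {
    "parse_config": "read and parse the yaml configuration file",
    "load_settings": "load application settings at startup",
    "send_email": "send a notification email to the user",
}
CALLS = {"load_settings": ["parse_config"], "parse_config": [], "send_email": []}

def embed(text):
    """Toy bag-of-words vector; a real system would use a learned embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def hybrid_query(question, hops=1):
    """Find the most similar node, then follow `calls` edges for `hops` steps."""
    query_vector = embed(question)
    entry = max(NODES, key=lambda name: cosine(query_vector, embed(NODES[name])))
    related, frontier = set(), {entry}
    for _ in range(hops):
        frontier = {callee for name in frontier for callee in CALLS.get(name, [])}
        related |= frontier
    return entry, sorted(related)

print(hybrid_query("which code loads the settings at startup"))
```

The design choice illustrated is that similarity search and graph traversal stay separate steps, so either one could be swapped out (for example, a Cypher query in place of the traversal loop).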
Benefits:
- Improved understanding of large codebases or datasets.
- Easier impact analysis and dependency tracking.
- Enables AI/ML applications by combining semantic search with structured knowledge representations.
Optional Enhancements:
- Incremental updates of the knowledge graph as the underlying data changes.
- Integration with existing graph query languages like Cypher or GraphQL.
In addition, as far as I know, this would be the first project to combine automatic vectorization with knowledge graphs of the underlying data.
Alternatives Considered
No response
Feature Area
Core (Client/Engine)
Use Case
It would be highly valuable to extend the capabilities of this vector database with automatic knowledge graph generation from codebases and other relational or structured documents. This would allow users to not only perform semantic searches but also understand relationships, dependencies, and hierarchies between entities in their data.
Example API (Optional)
Additional Context
No response
Contribution
- I am willing to contribute to implementing this feature