ℹ️ last updated April 21, 2025.
This is a work-in-progress repository where I collect my experiments on building software for RAG applications based on Knowledge Graphs.
What is currently provided is:
- source code
- example notebooks written while building the repo
- a Streamlit app to showcase work done until this point
- Docker files to build the image(s) for this project without having to manually install Neo4j and Ollama
A Knowledge Graph is a structured representation of information that connects concepts, entities, and their relationships in a way that mimics human understanding. It is often used to organize and integrate data from various sources, enabling machines to reason, infer, and retrieve relevant information more effectively.
💡 For more insight into why Knowledge Graphs matter, I've written a Medium Post. Feel free to check it out.
The key features of Knowledge Graphs can be divided into:
- Entities (Nodes): represent real-world objects like people, places, organizations, or abstract concepts;
- Relationships (Edges): define how entities are connected to each other (e.g., “Bill → WORKS_AT → Microsoft”);
- Attributes (Properties): provide additional details about entities (e.g., Microsoft’s founding year, revenue, or location) or relationships (e.g., “Bill → FRIENDS_WITH {since: 2021} → Mark”);
- Ontology (Schema): defines the structure and rules of the graph, ensuring consistency across the represented knowledge.
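To make the building blocks above concrete, here is a minimal in-memory sketch using the same Bill / Microsoft / Mark examples. The `Entity` and `Relation` dataclasses and the `neighbours` helper are names made up for illustration; they are not part of this repo, which stores the graph in Neo4j.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    """A node: a real-world object or an abstract concept."""
    id: str
    label: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class Relation:
    """A directed, typed edge between two entity ids."""
    source: str
    type: str
    target: str
    properties: Dict[str, str] = field(default_factory=dict)


entities = [
    Entity("Bill", "Person"),
    Entity("Mark", "Person"),
    Entity("Microsoft", "Organization"),
]
relations = [
    Relation("Bill", "WORKS_AT", "Microsoft"),
    Relation("Bill", "FRIENDS_WITH", "Mark", {"since": "2021"}),
]


def neighbours(entity_id: str, relations: List[Relation]) -> List[str]:
    """Return the ids reachable from entity_id by following one outgoing edge."""
    return [r.target for r in relations if r.source == entity_id]


print(neighbours("Bill", relations))  # ['Microsoft', 'Mark']
```

An ontology would sit on top of this as a list of the labels and relation types that are allowed to appear.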
RAG applications usually work pretty well for use cases where each piece of knowledge is self-contained within a single passage of a single document. However, vector search and hybrid search fall short in at least one regard: they do not account for relationships. This is where graphs come into play.
Vector similarity alone relies on explicit mentions in the Knowledge Base (intra-document level), while representing knowledge as a graph enables reasoning at a global dataset level (inter-document level). Combining the two approaches results in a more cohesive and grounded retrieval process, which is becoming known as GraphRAG.
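A toy illustration of the inter-document point: suppose two facts were extracted from two different documents, so no single chunk contains the full answer. Once both facts live in the same graph, a plain breadth-first search can chain them. The edge list and the `find_path` helper below are hypothetical and only meant to show the idea; the app itself relies on Neo4j for traversal.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hypothetical facts, each extracted from a different document.
edges: List[Tuple[str, str, str]] = [
    ("Bill", "WORKS_AT", "Microsoft"),       # from document A
    ("Microsoft", "LOCATED_IN", "Redmond"),  # from document B
]


def find_path(start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search over directed edges; returns the chain of hops or None."""
    adjacency: Dict[str, List[Tuple[str, str]]] = {}
    for source, rel, target in edges:
        adjacency.setdefault(source, []).append((rel, target))
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, target in adjacency.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [rel, target]))
    return None


print(find_path("Bill", "Redmond"))
# ['Bill', 'WORKS_AT', 'Microsoft', 'LOCATED_IN', 'Redmond']
```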
In order to showcase this approach to RAG, we will need some tools. If you are using the Dockerized version of this app, some of them are already set up for you in the Dockerfile.
- Neo4j: in this demo app, Neo4j is used both as a Vector Store and as a Graph Database; during the ingestion process, each Document is transformed into a node, and from it `Chunk` nodes are extracted (with `embeddings` as metadata for that node), while an agent is used to produce a graph representation of the content of the document.
- Ollama/OpenAI/Groq API: to power agents you will need LLMs and Embeddings models. For this demo, you can either provide an OpenAI / Azure OpenAI API key and endpoints or have Ollama installed on your machine. You could also use the Groq Cloud API if you want to test other open-weights models.
- Documents: documents from a specific domain to ingest; available formats are `.pdf`, `.docx`, `.txt`, `.html`.
- Configuration: for this demo to work, you should have all the settings for Neo4j, LLMs, etc. either in your environment or in a configuration file at the following path: `knowledge-graphs/config.env`.
The configuration file should look something like this:
```
NEO4J_URI=neo4j+s://<your_instance_id>.databases.neo4j.io
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password
AURA_INSTANCEID=your_instance_id
AURA_INSTANCENAME=Instance01
INDEX_NAME=vector
TIMEOUT=5000
CHUNKER_TYPE=recursive
CHUNKER_CHUNK_SIZE=1000
CHUNKER_CHUNK_OVERLAP=100
EMBEDDINGS_TYPE=ollama
EMBEDDINGS_MODEL_NAME=mxbai-embed-large
EMBEDDINGS_API_KEY=none
EMBEDDINGS_DEPLOYMENT=none
EMBEDDINGS_ENDPOINT=none
QA_MODEL_TYPE=groq
QA_MODEL_NAME=gemma2-9b-it
QA_MODEL_TEMPERATURE=0.0
QA_API_KEY=XXXX
QA_MODEL_DEPLOYMENT=none
QA_MODEL_ENDPOINT=none
```
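If you want to sanity-check a configuration file before starting the app, a few lines of standard-library Python are enough. This is only a sketch: `parse_env` is a hypothetical helper, not the loader used by this repo, and it assumes the plain `KEY=VALUE` layout shown above.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split once: values may contain '='
        settings[key.strip()] = value.strip()
    return settings


sample = """
# chunking
CHUNKER_TYPE=recursive
CHUNKER_CHUNK_SIZE=1000
CHUNKER_CHUNK_OVERLAP=100
QA_MODEL_TEMPERATURE=0.0
"""

settings = parse_env(sample)
chunk_size = int(settings["CHUNKER_CHUNK_SIZE"])
chunk_overlap = int(settings["CHUNKER_CHUNK_OVERLAP"])
temperature = float(settings["QA_MODEL_TEMPERATURE"])

# An overlap as large as the chunk itself would never advance through the text.
assert 0 <= chunk_overlap < chunk_size
print(chunk_size, chunk_overlap, temperature)  # 1000 100 0.0
```

Every value arrives as a string, so numeric settings such as the chunk size or the temperature need an explicit conversion.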
Neo4j is an open-source graph database with vector search capabilities. In this project, it is the backbone of our Knowledge Graph: each Document is stored as a node, connected to nodes representing its Chunks. It also stores the entity nodes and relationships extracted from the text, each connected to the Chunk it came from.
Neo4j offers both a managed, cloud-based experience (with a free tier) as well as a desktop version.
This is what a Knowledge Graph looks like in Neo4j:

To showcase how the code works, a Streamlit App has been built with the following pages:
- Home: mostly here to help the user navigate the web app;
- Upload: Upload documents into Neo4j following a pipeline of commands;
- Chat: Chat with the Knowledge Graph with various retrieval strategies (see the strategies comparison table).
💡 Before running the app, ensure you have:
- an active instance of Neo4j (either in the cloud with Aura or locally deployed)
- API keys to access LLMs and Embeddings models (OpenAI, Ollama, Groq...)
- a `config.env` file with your environment variables at the root of this folder
To run the app, open a terminal and run:
```bash
pip install -r requirements.txt
streamlit run app.py
```
A new webpage pointing to your localhost will appear and you will be able to test the app for yourself.
⚠ Like all Streamlit apps, this is only meant as a demo and should NOT be used for production use cases.
Docker is commonly used to package and containerize applications to make them ready for cloud deployment via container registries. It also helps avoid those "works on my machine" kinda issues.
In this repo you will find files for building the images you need for the demo, without having to install anything yourself (except for Docker of course).
If you want to build the app "as-is", you can just open a terminal and launch
```bash
docker compose up --build
```
to build the following images (each of them can also be run on its own):
- `Dockerfile`: builds the app image and lets you run it in an isolated environment (no need to have Python installed on your machine)
- `ollama.Dockerfile`: use it to containerize an Ollama version; currently a work in progress, Ollama is set to only use CPU inference
- `Neo4j`: the base image for Neo4j + additional libraries such as `APOC` and `graph-data-science`
⚠ Neo4j Community Edition does not support vector search natively as of now.
If you want to experiment with Neo4j Enterprise locally, Neo4j still allows you to run it for free for dev purposes (you just have to accept the license, as you can see in the `docker-compose.yml` file).
If the command does not throw errors, you should be able to see something like this

Currently, when uploading one or more files in the Streamlit App, each file is passed through a pipeline that will:
1. load it into a JSON format;
2. clean its text;
3. divide its text into smaller pieces, called chunks;
4. embed each chunk into its vector representation;
5. use an LLM to extract a graph of concepts from each chunk;
6. upload the obtained vectors and entities into the Knowledge Graph;
7. update the centrality measures and the division of the Graph into communities.
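To give a feel for the chunking step and for what the `CHUNKER_CHUNK_SIZE` and `CHUNKER_CHUNK_OVERLAP` settings control, here is a deliberately simple sliding-window splitter. Note that it is not the `recursive` chunker selected in the sample configuration (which would also try to respect separators such as paragraphs and sentences); `chunk_text` is a hypothetical stand-in that only shows how size and overlap interact.

```python
from typing import List


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence cut at a boundary still appears whole in at least one chunk when the overlap is large enough.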
In the near future, the plan is to integrate additional (optional) steps, such as one for entity resolution and one for link prediction between entities.
Step #5 is probably the least obvious one. It is performed by an agent called GraphExtractor, which returns structured output mimicking a pydantic class:
```python
from typing import Dict, List, Optional

# Assumption: `Serializable` is LangChain's base class; the import is not shown in the original snippet.
from langchain_core.load.serializable import Serializable


class _Node(Serializable):
    id: str
    type: str
    properties: Optional[Dict[str, str]] = None


class _Relationship(Serializable):
    source: str
    target: str
    type: str
    properties: Optional[Dict[str, str]] = None


class _Graph(Serializable):
    """
    Represents a graph consisting of nodes and relationships.
    -----------
    Attributes:
    -----------
    `nodes (List[_Node])`: A list of nodes in the graph.
    `relationships (List[_Relationship])`: A list of relationships in the graph.
    """
    nodes: List[_Node]
    relationships: List[_Relationship]
```
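LLM output is not guaranteed to be internally consistent, so it can be worth checking a payload before uploading it. The sketch below uses a made-up JSON string shaped like the `_Graph` class and plain dictionaries instead of the real classes; `dangling_relationships` is a hypothetical helper, and whether the GraphExtractor performs a check like this is not something this README states.

```python
import json
from typing import Dict, List

# Hypothetical raw LLM output, shaped like the `_Graph` class above.
raw_output = """
{
  "nodes": [
    {"id": "Bill", "type": "Person"},
    {"id": "Microsoft", "type": "Organization"}
  ],
  "relationships": [
    {"source": "Bill", "target": "Microsoft", "type": "WORKS_AT"},
    {"source": "Bill", "target": "Mark", "type": "FRIENDS_WITH", "properties": {"since": "2021"}}
  ]
}
"""


def dangling_relationships(graph: Dict[str, List[dict]]) -> List[dict]:
    """Return relationships whose source or target is not among the declared nodes."""
    node_ids = {node["id"] for node in graph["nodes"]}
    return [
        rel for rel in graph["relationships"]
        if rel["source"] not in node_ids or rel["target"] not in node_ids
    ]


graph = json.loads(raw_output)
for rel in dangling_relationships(graph):
    print(f"dangling: {rel['source']} -[{rel['type']}]-> {rel['target']}")
# dangling: Bill -[FRIENDS_WITH]-> Mark
```

Here "Mark" is referenced by a relationship but never declared as a node, so that edge would either be dropped or trigger the creation of the missing node, depending on the policy you prefer.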
When extracting a Knowledge Graph from document chunks, it might make sense to give the GraphExtractor in charge of this task an Ontology in the form of a pydantic class:
```python
from typing import Dict, List, Optional
from pydantic import BaseModel


class Ontology(BaseModel):
    allowed_labels: Optional[List[str]] = None
    labels_descriptions: Optional[Dict[str, str]] = None
    allowed_relations: Optional[List[str]] = None
```
Since ontologies are by definition domain-dependent, what happens when the user is not a subject-matter expert (SME)?
My suggestion is to use another Agent called OntologyExplorer to infer the ontology of the domain from a subset of chunks; the output of this agent will be one of the inputs for the GraphExtractor.
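One way to picture how an ontology constrains extraction is as a filter applied to the extracted graph. The sketch below is an assumption about usage, not the repo's implementation (the GraphExtractor may well enforce the ontology through its prompt instead). `OntologySketch` is a dataclass stand-in with the same three fields as `Ontology`, used here only to keep the example dependency-free, and `enforce` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class OntologySketch:
    """Stand-in with the same fields as `Ontology`; None means 'no restriction'."""
    allowed_labels: Optional[List[str]] = None
    labels_descriptions: Optional[Dict[str, str]] = None
    allowed_relations: Optional[List[str]] = None


def enforce(ontology: OntologySketch, nodes: List[dict],
            relationships: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Drop nodes and relationships whose type falls outside the ontology."""
    kept_nodes = [
        n for n in nodes
        if ontology.allowed_labels is None or n["type"] in ontology.allowed_labels
    ]
    kept_ids = {n["id"] for n in kept_nodes}
    kept_rels = [
        r for r in relationships
        if (ontology.allowed_relations is None or r["type"] in ontology.allowed_relations)
        and r["source"] in kept_ids and r["target"] in kept_ids
    ]
    return kept_nodes, kept_rels


ontology = OntologySketch(
    allowed_labels=["Person", "Organization"],
    allowed_relations=["WORKS_AT"],
)
nodes = [
    {"id": "Bill", "type": "Person"},
    {"id": "Microsoft", "type": "Organization"},
    {"id": "Monday", "type": "Weekday"},
]
relationships = [
    {"source": "Bill", "target": "Microsoft", "type": "WORKS_AT"},
    {"source": "Bill", "target": "Monday", "type": "HATES"},
]

kept_nodes, kept_rels = enforce(ontology, nodes, relationships)
print([n["id"] for n in kept_nodes])   # ['Bill', 'Microsoft']
print([r["type"] for r in kept_rels])  # ['WORKS_AT']
```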
Once documents are uploaded into the Knowledge Graph, the user can query it using the GraphAgentResponder. Under the hood, this is a basic agent that runs via LLM / Embeddings API calls and has several strategies at its disposal to query and traverse the Knowledge Graph in order to answer the user's query.
Here is a comparison table with available options.
| Method | Description | Token Usage | Latency | Params | Performances |
|---|---|---|---|---|---|
| `answer_with_cypher` | Uses only the Cypher chain to answer the user's question | Medium | Low | `intermediate_steps` | The better the schema of the graph is defined, the higher the performance |
| `answer_with_context` | Uses only vanilla RAG to answer the user's question. If `use_adjacent_chunks=True`, queries the graph for context in addition to the Chunks retrieved by the similarity search | Low | Low | `use_adjacent_chunks` | Depends on the quality of the Chunks and on how self-contained the question is |
| `answer_with_community_reports` | Queries two vector indexes to build the answer from an ensemble of contexts: one made of a list of `CommunityReport` and one made of a list of `Chunk` from the same communities as the reports. If `use_adjacent_chunks=True`, queries the graph for context in addition to the Chunks retrieved by the similarity search | Medium | Low / Medium | `use_adjacent_chunks`, `community_type` | Enhanced similarity search; performance varies with the attention window of the LLM |
| `answer_with_community_subgraph` | Answers after querying for communities: (i) reads the most relevant community reports (ii) fetches Chunks belonging to the most relevant community (iii) follows the `MENTIONS` relationship of each Chunk (iv) fetches the community subgraph (v) passes the subgraph + Chunks + the report to a reconciler agent to decide how to answer | High | Medium | `community_type` | Performance varies with the attention window of the LLM; might get chaotic |
| `answer` | Answers the user query by generating text after retrieving context via both Vector Search and Cypher queries. Results from both methods are synthesized into a comprehensive answer | High | High | `use_adjacent_chunks`, `filter` | Generally the best (most on point) answering strategy. Might be too complex for smaller models to handle |
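The effect of `use_adjacent_chunks` can be sketched with toy vectors. The example assumes that "adjacent" means the previous and next chunk of the same document, which is an interpretation rather than something stated in the table; `retrieve` and `cosine` are hypothetical helpers, and in the app the similarity search and the neighbour lookup are delegated to Neo4j.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Sequence[float], chunk_vecs: Dict[int, Sequence[float]],
             k: int = 1, use_adjacent_chunks: bool = False) -> List[int]:
    """Return chunk positions: top-k by cosine similarity, optionally plus direct neighbours."""
    ranked = sorted(chunk_vecs, key=lambda i: cosine(query_vec, chunk_vecs[i]), reverse=True)
    hits = set(ranked[:k])
    if use_adjacent_chunks:
        for i in list(hits):
            hits.update(j for j in (i - 1, i + 1) if j in chunk_vecs)
    return sorted(hits)


# Toy 2-d embeddings keyed by chunk position within one document.
chunk_vecs = {0: [1.0, 0.0], 1: [0.6, 0.8], 2: [0.0, 1.0], 3: [-1.0, 0.0]}
query = [0.0, 1.0]

print(retrieve(query, chunk_vecs))                            # [2]
print(retrieve(query, chunk_vecs, use_adjacent_chunks=True))  # [1, 2, 3]
```

Chunk 3 has zero similarity with the query and would never be returned by vector search alone, yet it comes along as a neighbour of the best hit. That is the trade-off behind the flag: more surrounding context at the cost of more tokens.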
This app currently offers various options for LLM and Embeddings deployment; since this is built mostly for fun, I am currently using Ollama and Groq models.
Embeddings:
Large Language Models:



