iDC-NEU/EvoRAG
EvoRAG: Evolving KG-based RAG with Human Feedback-driven Backpropagation

EvoRAG system overview

Knowledge Graph-based Retrieval-Augmented Generation (KG-RAG) has emerged as a promising paradigm for enhancing LLM reasoning by retrieving multi-hop paths from knowledge graphs. However, existing KG-RAG frameworks often underperform in real-world scenarios because the pre-captured knowledge dependencies are not tailored to the downstream generation task or its evolving requirements. These frameworks struggle to adapt to user intent and lack mechanisms to filter low-contribution knowledge during generation. We observe that human feedback on generated responses offers effective supervision for improving KG quality, as it directly reflects user expectations and provides insight into the correctness and usefulness of the output. A key challenge, however, lies in effectively linking response-level feedback to triplet-level updates in the knowledge graph.

In this work, we propose EvoRAG, a self-evolving KG-RAG framework that leverages human feedback to continuously refine the KG and enhance reasoning accuracy. EvoRAG introduces a feedback-driven backpropagation mechanism that attributes feedback to retrieved paths by measuring their utility for the response, and propagates this utility back to individual triplets, supporting fine-grained KG refinement towards more adaptive and accurate reasoning. Through EvoRAG, we establish a closed loop that couples humans, LLMs, and graph data, continuously enhancing performance and robustness in real-world scenarios. Experimental results show that EvoRAG improves reasoning accuracy by 7.34% over state-of-the-art KG-RAG frameworks.
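
The path-to-triplet attribution described above can be sketched roughly as follows. This is an illustrative assumption of the mechanism, not EvoRAG's actual API (see `KGModify/` for the real implementation): response-level feedback is shared among retrieved paths in proportion to each path's utility, then split evenly over the triplets on each path.

```python
# Illustrative sketch of feedback-driven backpropagation (names are
# hypothetical, not taken from the EvoRAG codebase).
from collections import defaultdict

def backpropagate_feedback(paths, utilities, feedback, scores):
    """paths: list of paths, each a list of (head, rel, tail) triplets.
    utilities: per-path utility weights for the generated response.
    feedback: scalar human feedback on the response (e.g., +1 / -1).
    scores: dict mapping triplet -> running contribution score (updated in place)."""
    total = sum(utilities) or 1.0
    for path, u in zip(paths, utilities):
        path_credit = feedback * (u / total)            # share of feedback for this path
        for triplet in path:
            scores[triplet] += path_credit / len(path)  # split evenly over its triplets
    return scores
```

Triplets whose running score stays low across many question-answering cycles are candidates for refinement or removal, which is what closes the human-LLM-graph loop.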

Project Structure

  • `requirements.txt`: Python dependencies
  • `run_batch.sh`: launch script
  • `chat/`: LLM prompts (see `chat_graphrag.py`)
  • `config/`: configuration files
  • `database/`: persistent storage
  • `dataset/`: raw datasets
  • `KGModify/`: core graph-modification logic
  • `llmragenv/`: LLM interface layer
  • `logs/`: runtime logs
  • `utils/`: shared utilities
  • `kg_modify.py`: entry point

🔨 Setup

# Create conda environment: python >= 3.10
conda create --name llmrag python=3.10.14 -y

conda activate llmrag

# Install required Python packages:
pip install -r requirements.txt

📦 Deploy Graph Database

  1. NebulaGraph Installation Guide

Step 1: Install docker-compose. Ensure that you have docker-compose installed; if not, install it with the following command:
sudo apt install docker-compose

Step 2: Clone NebulaGraph Docker Compose Repository In a directory of your choice, clone the NebulaGraph Docker Compose files:

git clone https://github.com/vesoft-inc/nebula-docker-compose.git
cd nebula-docker-compose

Step 3: Start NebulaGraph In the nebula-docker-compose directory, run the following command to start NebulaGraph:

docker-compose up -d

Step 4: Check NebulaGraph Container Status After starting, you can verify that the NebulaGraph container is running by using:

docker ps

Step 5: Connect to NebulaGraph To connect to NebulaGraph inside the container, use the following command:

nebula-console -u <user> -p <password> --address=graphd --port=9669
# Replace <user> and <password> with the actual username and password.
# Port 9669 is used in the default configuration.

Step 6: Enable Data Persistence To ensure that data persists even after the container is restarted, you can mount persistent volumes. Either modify the volumes section in the docker-compose.yaml file, or manually run the following command with specified persistence paths:

docker run -d --name nebula-graph \
    -v /yourpath/nebula/data:/data \
    -v /yourpath/nebula/logs:/logs \
    -p 9669:9669 \
    vesoft/nebula-graphd:v2.5.0
# Replace /yourpath/nebula with your actual data-persistence path.
  2. Neo4j (installation optional for now)

⚙️ Configuration

  1. Before running the system, you need to specify the paths for the local LLM model and the cached knowledge graph embeddings in the config/path-local.yaml file.
# Example path-local.yaml
local_embedding_path: "/your/path/to/entity_embedding.npz"
LLM:
  Qwen2.5-32B-Instruct:
    template_format: Qwen2.5
    modelpath: "/your/path/to/llm_model"
  2. Algorithm and batch settings

You can configure algorithm parameters such as algorithm type, batch_size, etc., in config/algorithm.yaml:

# Example algorithm.yaml
algorithm: "standard_batch"
batch_size: 16

Important:

Before running the startup script (run_batch.sh), ensure that the ALGORITHM variable in the script matches the algorithm field in config/algorithm.yaml.

The startup script can also override additional parameters such as the specific LLM model to use, number of iterations, and other runtime settings.
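
The script-vs-config consistency requirement above can be checked automatically. The sketch below is a minimal, illustrative helper (assuming an `ALGORITHM=...` assignment in `run_batch.sh` and the `algorithm:` field shown in the example `algorithm.yaml`), not part of the repository:

```python
# Hypothetical pre-run sanity check: does ALGORITHM in the launch script
# match the `algorithm` field in config/algorithm.yaml?
import re

def algorithms_match(script_text: str, yaml_text: str) -> bool:
    """Return True when both files name the same algorithm."""
    script = re.search(r'^ALGORITHM=["\']?([\w\-]+)["\']?', script_text, re.M)
    config = re.search(r'^algorithm:\s*["\']?([\w\-]+)["\']?', yaml_text, re.M)
    return bool(script and config and script.group(1) == config.group(1))
```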

💄 Run

  1. Start everything with one command:
bash run_batch.sh
  2. Feedback with Noise
bash run_noise.sh

This script runs the feedback phase with controlled noise injection, used to test the robustness of the feedback mechanism. Noise can simulate uncertain or ambiguous user evaluations, helping assess the model’s stability and adaptive capability.
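
One plausible form of such noise injection is label flipping: each binary feedback signal is inverted with some probability. The function below is a hedged sketch of this idea under that assumption, not the repository's implementation:

```python
# Illustrative noise injection: flip each +1/-1 feedback label
# with probability `noise_rate`, using a seeded RNG for reproducibility.
import random

def add_feedback_noise(labels, noise_rate, seed=0):
    """Return a copy of `labels` with each label flipped w.p. noise_rate."""
    rng = random.Random(seed)
    return [-y if rng.random() < noise_rate else y for y in labels]
```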

  3. Entity Count Variation in Retrieval
bash run_case_entity.sh

This script experiments with different numbers of retrieved entities during the graph retrieval process. It helps evaluate how retrieval breadth (i.e., entity expansion) affects final RAG performance and reasoning quality.
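
The knob this experiment varies, retrieval breadth, amounts to choosing how many top-scoring entities seed graph expansion. A minimal sketch (names are illustrative, not the repo's retrieval API):

```python
# Illustrative retrieval-breadth selection: keep the n entities most
# similar to the query and use them to seed multi-hop expansion.
def top_entities(entity_scores, n):
    """entity_scores: dict entity -> similarity; return the n best entities."""
    ranked = sorted(entity_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [entity for entity, _ in ranked[:n]]
```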

  4. Edge Count Variation in Retrieval (Pruning Analysis)
bash run_case_pruning.sh

This script tests edge-level pruning strategies by varying the number of graph edges used during retrieval. It can be used to analyze how relational sparsity or pruning thresholds influence semantic path selection and final results.
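
Edge-level pruning of this kind can be pictured as keeping only the top-k scored edges during expansion. A hedged sketch under that assumption (the actual pruning strategy lives in the repository's retrieval code):

```python
# Illustrative edge pruning: keep the `max_edges` highest-scored edges,
# dropping low-contribution relations before path selection.
def prune_edges(edges, max_edges):
    """edges: list of (score, (head, rel, tail)); return the max_edges best triplets."""
    ranked = sorted(edges, key=lambda e: e[0], reverse=True)
    return [triplet for _, triplet in ranked[:max_edges]]
```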

EvoRAG Workflow Diagram

The EvoRAG workflow, illustrated above, iteratively refines the knowledge graph by incorporating human feedback at each question-answering cycle, thereby continuously boosting the overall performance of the RAG system.

Notes

  1. `.env` file loading is deprecated; configuration now comes from client input, including the LLM name.
  2. The method `low_chat()` in ./llmragenv/llmrag_env.py is a simplified-input version in which the LLM name, database usage, etc., are hardcoded. The `web_chat` method is the full version.
  3. LLM support: the `llm_provider` dictionary in `llm_factory` lists all currently supported local models. (Commercial-model API keys are not enabled here due to cost, but users can purchase them separately and configure them in ./config/config-local.yaml.)
  4. Frontend ports and database configurations can be modified in ./config/config-local.yaml (the vector DB and NebulaGraph settings are hardcoded in the code and need refactoring).
  5. Code structure:

Code structure
