Merged
5 changes: 4 additions & 1 deletion README.md
@@ -154,7 +154,10 @@ This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENS

 ## Development Guide

-> **TODO**: Development guide content to be added
+For developers who want to build and run PowerRAG from source code:
+
+- **[Developer Setup Guide](docs/developer-setup-guide.md)**: Complete guide for deploying PowerRAG from source code
+

 ---

5 changes: 4 additions & 1 deletion README_zh.md
@@ -153,7 +153,10 @@ PowerRAG runs as a standalone backend service:

 ## Development Guide

-> **TODO**: Development guide content to be added
+For developers who want to build and run PowerRAG from source code:
+
+- **[Developer Deployment Guide](docs/developer-setup-guide-zh.md)**: Complete guide for deploying PowerRAG from source code
+

 ---

24 changes: 14 additions & 10 deletions conf/service_conf.yaml
@@ -5,11 +5,11 @@ admin:
   host: 0.0.0.0
   http_port: 9381
 mysql:
-  name: 'rag_flow'
+  name: 'powerrag'
   user: 'root'
-  password: 'infini_rag_flow'
-  host: 'localhost'
-  port: 5455
+  password: 'powerrag'
+  host: '127.0.0.1'
+  port: 2881
Comment on lines +8 to +12
Copilot AI Feb 12, 2026

This config switches the metadata DB to use the root account with a hard-coded password. Even for dev defaults, using a privileged user and committing real credentials makes accidental exposure/misuse more likely; prefer a dedicated least-privilege DB user and/or read the password from environment variables (keeping this file as a template/example).
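
One way to act on this comment is to keep the password out of the committed file and resolve it from the environment at startup. The sketch below is only an illustration under stated assumptions: the variable names `POWERRAG_MYSQL_USER` and `POWERRAG_MYSQL_PASSWORD`, the default user `powerrag_app`, and the helper `resolve_db_credentials` are all hypothetical and do not exist in this PR.

```shell
# Hypothetical helper: resolve metadata DB credentials from the environment
# instead of committing them to conf/service_conf.yaml.
resolve_db_credentials() {
    # Default to a dedicated, non-root application user (hypothetical name).
    DB_USER="${POWERRAG_MYSQL_USER:-powerrag_app}"

    # Refuse to start without a password rather than fall back to a
    # hard-coded default.
    if [ -z "${POWERRAG_MYSQL_PASSWORD:-}" ]; then
        echo "POWERRAG_MYSQL_PASSWORD is not set" >&2
        return 1
    fi
    DB_PASSWORD="$POWERRAG_MYSQL_PASSWORD"
}
```

A launch script could call `resolve_db_credentials` before starting the services and pass `DB_USER` and `DB_PASSWORD` on to whatever renders the final config; the committed YAML would then serve as a template only.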

   max_connections: 900
   stale_timeout: 300
   max_allowed_packet: 1073741824
@@ -31,12 +31,12 @@ infinity:
   uri: 'localhost:23817'
   db_name: 'default_db'
 oceanbase:
-  scheme: 'oceanbase' # set 'mysql' to create connection using mysql config
+  scheme: 'mysql'
   config:
-    db_name: 'test'
-    user: 'root@ragflow'
-    password: 'infini_rag_flow'
-    host: 'localhost'
+    db_name: 'powerrag_doc'
+    user: 'root'
+    password: 'powerrag'
+    host: '127.0.0.1'
     port: 2881
Comment on lines +37 to 40
Copilot AI Feb 12, 2026

With scheme: 'mysql', the OceanBase connection code reads host/port/user/password from the top-level mysql section, not from oceanbase.config (only db_name is used). Keeping duplicate credentials here is misleading and can drift; either remove the unused fields from oceanbase.config or switch scheme to the custom mode so oceanbase.config is authoritative.

Suggested change
-    user: 'root'
-    password: 'powerrag'
-    host: '127.0.0.1'
-    port: 2881

 redis:
   db: 1
@@ -53,8 +53,12 @@ redis:
 # # If this item is not configured, the system automatically falls back to the local mineru CLI (which must be installed via pip install -U 'mineru[core]')
 # dots_ocr:
 # vllm_url: 'http://localhost:8020'
+opendal:
+  scheme: 'mysql'
+  config:
+    oss_table: 'opendal_storage'
 task_executor:
-  message_queue_type: 'redis'
+  message_queue_type: 'oceanbase'
 user_default_llm:
   default_models:
     embedding_model:
57 changes: 53 additions & 4 deletions docker/launch_backend_service.sh
@@ -24,12 +24,25 @@ load_env_file() {
 # Load environment variables
 load_env_file

+# Get the project root directory (parent of docker directory)
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
+cd "$PROJECT_ROOT"
+
 # Unset HTTP proxies that might be set by Docker daemon
 export http_proxy=""; export https_proxy=""; export no_proxy=""; export HTTP_PROXY=""; export HTTPS_PROXY=""; export NO_PROXY=""
-export PYTHONPATH=$(pwd)
+export PYTHONPATH="$PROJECT_ROOT"

 export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/
-JEMALLOC_PATH=$(pkg-config --variable=libdir jemalloc)/libjemalloc.so
+
+# Try to find jemalloc, but don't fail if not found
+JEMALLOC_PATH=""
+if command -v pkg-config >/dev/null 2>&1; then
+    JEMALLOC_LIBDIR=$(pkg-config --variable=libdir jemalloc 2>/dev/null || echo "")
+    if [ -n "$JEMALLOC_LIBDIR" ] && [ -f "$JEMALLOC_LIBDIR/libjemalloc.so" ]; then
+        JEMALLOC_PATH="$JEMALLOC_LIBDIR/libjemalloc.so"
+    fi
+fi

 PY=python3

@@ -48,7 +61,12 @@ STOP=false
 PIDS=()

 # Set the path to the NLTK data directory
-export NLTK_DATA="./nltk_data"
+export NLTK_DATA="$PROJECT_ROOT/nltk_data"
+
+# Set additional environment variables for OceanBase
+export DOC_ENGINE=${DOC_ENGINE:-oceanbase}
+export STORAGE_IMPL=${STORAGE_IMPL:-OPENDAL}
+export CACHE_TYPE=${CACHE_TYPE:-oceanbase}

 # Function to handle termination signals
 cleanup() {
@@ -73,7 +91,11 @@ task_exe(){
     local retry_count=0
     while ! $STOP && [ $retry_count -lt $MAX_RETRIES ]; do
         echo "Starting task_executor.py for task $task_id (Attempt $((retry_count+1)))"
-        LD_PRELOAD=$JEMALLOC_PATH $PY rag/svr/task_executor.py "$task_id"
+        if [ -n "$JEMALLOC_PATH" ]; then
+            LD_PRELOAD=$JEMALLOC_PATH $PY rag/svr/task_executor.py "$task_id"
+        else
+            $PY rag/svr/task_executor.py "$task_id"
+        fi
Comment on lines +94 to +98
Copilot AI Feb 12, 2026

Because this script runs with set -e, a non-zero exit from task_executor.py will terminate the whole script immediately, so the retry loop won’t actually retry. Run the python command in a context that suppresses -e (e.g., temporarily set +e, or append || true and capture $?) so failures can be handled by the retry logic.
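
A minimal sketch of the capture-the-status pattern this comment describes, not the script's actual code. The helper name `run_with_retry` is hypothetical, and `MAX_RETRIES=3` is an assumed value chosen for the example.

```shell
set -e

MAX_RETRIES=3  # assumed value for this sketch

# Hypothetical helper: retry the command given as arguments up to
# MAX_RETRIES times, even though errexit (set -e) is active.
run_with_retry() {
    local retry_count=0
    local exit_code
    while [ "$retry_count" -lt "$MAX_RETRIES" ]; do
        exit_code=0
        # A command on the left of "||" does not trigger errexit, so a
        # failure is recorded in exit_code instead of ending the script.
        "$@" || exit_code=$?
        if [ "$exit_code" -eq 0 ]; then
            return 0
        fi
        retry_count=$((retry_count + 1))
        echo "Attempt $retry_count failed with exit code $exit_code" >&2
    done
    return 1
}
```

Applied to the loop in this hunk, the same idea would mean resetting `EXIT_CODE=0` and appending `|| EXIT_CODE=$?` to each `task_executor.py` invocation, so the existing `if [ $EXIT_CODE -eq 0 ]` check is actually reached on failure.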

         EXIT_CODE=$?
         if [ $EXIT_CODE -eq 0 ]; then
             echo "task_executor.py for task $task_id exited successfully."
@@ -114,6 +136,29 @@ run_server(){
     fi
 }

+# Function to execute sync_data_source with retry logic
+run_sync_data_source(){
+    local retry_count=0
+    while ! $STOP && [ $retry_count -lt $MAX_RETRIES ]; do
+        echo "Starting sync_data_source.py (Attempt $((retry_count+1)))"
+        $PY rag/svr/sync_data_source.py
+        EXIT_CODE=$?
+        if [ $EXIT_CODE -eq 0 ]; then
Comment on lines +142 to +146
Copilot AI Feb 12, 2026

This retry loop won’t work as intended under set -e: if sync_data_source.py exits non-zero, the script will exit before EXIT_CODE=$? is reached. Wrap the python invocation to avoid set -e aborting (temporarily disable -e or use cmd; EXIT_CODE=$? with set +e/set -e around it).
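
The `set +e` / `set -e` variant mentioned here could look roughly like the following. This is a sketch only: `capture_status` is a hypothetical helper, and it assumes the caller always runs with errexit enabled, since it re-enables `-e` unconditionally.

```shell
set -e

# Hypothetical helper: run a command with errexit suspended and store its
# exit status in the global EXIT_CODE for the caller to inspect.
capture_status() {
    set +e          # suspend errexit so a failure does not end the script
    "$@"
    EXIT_CODE=$?    # record the real exit status of the command
    set -e          # restore errexit (assumes it was on before the call)
}
```

Inside `run_sync_data_source`, a call such as `capture_status $PY rag/svr/sync_data_source.py` would leave the existing `if [ $EXIT_CODE -eq 0 ]` branch and the retry counter unchanged.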

+            echo "sync_data_source.py exited successfully."
+            break
+        else
+            echo "sync_data_source.py failed with exit code $EXIT_CODE. Retrying..." >&2
+            retry_count=$((retry_count + 1))
+            sleep 2
+        fi
+    done
+
+    if [ $retry_count -ge $MAX_RETRIES ]; then
+        echo "sync_data_source.py failed after $MAX_RETRIES attempts. Exiting..." >&2
+        cleanup
+    fi
+}
+
 # Start task executors
 for ((i=0;i<WS;i++))
 do
@@ -125,5 +170,9 @@ done
 run_server &
 PIDS+=($!)

+# Start data source sync service (required for AliDing KB, Notion, Confluence, etc.)
+run_sync_data_source &
+PIDS+=($!)
+
 # Wait for all background processes to finish
 wait