This project implements a PDF document processing system that uses vector embeddings for semantic search. It follows a client-server architecture: users upload PDFs through the client and run semantic searches over their content.
pdf-rag/
├── client/ # Next.js frontend
├── server/ # Node.js backend
└── docker-compose.yml # Docker configuration
- Node.js (v18 or higher)
- Docker and Docker Compose
- pnpm (Package Manager)
- Clone the repository:
git clone <your-repo-url>
cd pdf-rag
- Install dependencies:
# Install server dependencies
cd server
pnpm install
# Install client dependencies
cd ../client
pnpm install
- Set up environment variables:
  - Copy .env.example to .env in both the client and server directories
  - Fill in the required environment variables
- Start the services:
# Start Docker containers (Redis and Qdrant)
docker-compose up -d
# Start the server
cd server
pnpm dev
# Start the client (in a new terminal)
cd client
pnpm dev
- PDF document upload
- Vector embeddings generation using Google AI
- Document storage in Qdrant vector database
- Semantic search capabilities
- Real-time processing with Redis queue
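To illustrate what semantic search over embeddings means in the features above, here is a minimal sketch that ranks stored chunks by cosine similarity to a query vector. It is a brute-force stand-in, not the project's code: in the actual system Qdrant performs the nearest-neighbor search, and the distance metric it is configured with is not stated here. The `EmbeddedChunk` type, the `topK` helper, and the toy vectors are all hypothetical.

```typescript
// A stored chunk of document text with its embedding vector.
// Field names are hypothetical, not the project's schema.
interface EmbeddedChunk {
  text: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query vector, best match first.
function topK(query: number[], chunks: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return [...chunks]
    .sort((x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector))
    .slice(0, k);
}

// Toy 3-dimensional vectors; real embeddings have far more dimensions.
const chunks: EmbeddedChunk[] = [
  { text: "invoice totals", vector: [1, 0, 0] },
  { text: "shipping address", vector: [0, 1, 0] },
  { text: "payment terms", vector: [0.9, 0.1, 0] },
];
const results = topK([1, 0, 0], chunks, 2);
```

A dedicated vector database replaces the linear scan in `topK` with an index, which is what keeps queries fast as the number of stored chunks grows.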
- Frontend: Next.js
- Backend: Node.js
- Vector Database: Qdrant
- Queue: BullMQ with Redis
- Embeddings: Google AI
- Container: Docker
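The queue in the stack above decouples uploading from processing: the upload handler only enqueues a job, and a separate worker picks it up later. The sketch below shows that pattern with an in-memory queue so it runs without Redis. It is an assumption-laden illustration, not the project's implementation: `InMemoryQueue`, `PdfJob`, and its fields are hypothetical, and the real system would use a BullMQ queue and worker backed by Redis, with asynchronous handlers.

```typescript
// Hypothetical job payload; the field names are assumptions, not the project's schema.
interface PdfJob {
  filename: string;
  path: string;
}

// Minimal in-memory FIFO queue standing in for a Redis-backed BullMQ queue.
class InMemoryQueue<T> {
  private jobs: T[] = [];

  // Called by the producer (for example, an upload handler).
  add(job: T): void {
    this.jobs.push(job);
  }

  // Drain all pending jobs through the handler in insertion order.
  // Returns the number of jobs processed.
  process(handler: (job: T) => void): number {
    let processed = 0;
    while (this.jobs.length > 0) {
      const job = this.jobs.shift() as T;
      handler(job);
      processed++;
    }
    return processed;
  }
}

const indexed: string[] = [];
const queue = new InMemoryQueue<PdfJob>();

// The upload side only enqueues and returns immediately.
queue.add({ filename: "a.pdf", path: "uploads/a.pdf" });
queue.add({ filename: "b.pdf", path: "uploads/b.pdf" });

// The worker side would parse the PDF, generate embeddings, and store them;
// here it just records the filename to show the ordering.
const processedCount = queue.process((job) => {
  indexed.push(job.filename);
});
```

Keeping the slow steps in the worker means an upload request can return as soon as the job is queued, regardless of how long embedding takes.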