A Retrieval-Augmented Generation (RAG) API built with NestJS that enables document upload, text extraction, vector embedding storage, and intelligent Q&A powered by OpenAI.
- 📄 Document Upload – Upload PDF, TXT, and Markdown files
- 🔍 Text Extraction – Automatic text extraction from PDFs using pdf-parse
- 🧠 Vector Embeddings – Generate embeddings via OpenAI's embedding models
- 💾 Vector Storage – Store and query embeddings using ChromaDB
- 🤖 RAG Query – Ask questions and get answers based on your documents
- 🐘 PostgreSQL – Document metadata persistence with TypeORM
- 🔐 JWT Authentication – Secure API endpoints with JWT-based authentication
- 🔄 Database Migrations – Production-ready TypeORM migrations
| Category | Technology |
|---|---|
| Framework | NestJS v11 |
| Language | TypeScript |
| Database | PostgreSQL 16 |
| Vector Store | ChromaDB |
| LLM Provider | OpenAI (GPT-4o-mini, text-embedding-3-small) |
| PDF Parsing | pdf-parse v2 |
| Package Manager | pnpm |
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Client/API    │────▶│   NestJS App    │────▶│   PostgreSQL    │
└─────────────────┘     └────────┬────────┘     │ (doc metadata)  │
                                 │              └─────────────────┘
                                 │
                    ┌────────────┼────────────┐
                    ▼            ▼            ▼
              ┌──────────┐ ┌──────────┐ ┌──────────┐
              │  OpenAI  │ │ ChromaDB │ │ Storage  │
              │Embeddings│ │ (vectors)│ │ (files)  │
              └──────────┘ └──────────┘ └──────────┘
```
- Node.js 20+
- pnpm
- Docker & Docker Compose
- OpenAI API Key
```
git clone <repository-url>
cd rag-doc-app
pnpm install
```

Create a `.env` file in the project root, or run the command below:

```
cp .env.example .env
```

Then update the `.env` content as shown below:

```
# PostgreSQL
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=rag
POSTGRES_USER=rag
POSTGRES_PASSWORD=rag_password_change_me

# Database Options
DB_MIGRATIONS_RUN=true
DB_LOGGING=true

# ChromaDB
CHROMA_HOST=localhost
CHROMA_PORT=8000
CHROMA_COLLECTION=default_kb

# OpenAI
OPENAI_API_KEY=sk-your-api-key-here
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_CHAT_MODEL=gpt-4o-mini

# JWT Authentication
JWT_SECRET=your_super_secret_jwt_key_change_me_in_production
JWT_EXPIRES_IN=7d

# RAG Settings
RAG_TOP_K=5
RAG_MAX_CONTEXT_CHARS=12000
```

Start the backing services:

```
docker compose up -d
```

This starts:
- PostgreSQL on port 5432
- ChromaDB on port 8000
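The repository's `docker-compose.yml` is not reproduced here; a minimal sketch that would provide the two services on the documented ports might look like this (image tags and service names are assumptions, and the actual file may add volumes and healthchecks):

```yaml
# Hypothetical compose file -- the repository's actual docker-compose.yml
# may differ in image versions, volumes, and healthchecks.
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_DB: rag
      POSTGRES_USER: rag
      POSTGRES_PASSWORD: rag_password_change_me
    ports:
      - "5432:5432"
  chromadb:
    image: chromadb/chroma
    ports:
      - "8000:8000"
```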
```
# Development (watch mode)
pnpm run start:dev

# Production
pnpm run build
pnpm run start:prod
```

The API will be available at http://localhost:3000.
| Method | Endpoint | Description | Auth Required |
|---|---|---|---|
| POST | `/auth/register` | Register a new user | No |
| POST | `/auth/login` | Login and get JWT token | No |
| GET | `/auth/me` | Get current user profile | Yes |
| Method | Endpoint | Description | Auth Required |
|---|---|---|---|
| GET | `/` | Hello endpoint | No |
| GET | `/health` | Health check endpoint | No |
| Method | Endpoint | Description | Auth Required |
|---|---|---|---|
| POST | `/documents/upload` | Upload a document (multipart/form-data with `file` field) | Yes |
| GET | `/documents/:id` | Get document metadata by ID | Yes |
| GET | `/documents/:id/download` | Download the original file | Yes |
| Method | Endpoint | Description | Auth Required |
|---|---|---|---|
| POST | `/documents/:id/ingest` | Process document: extract text, chunk, embed, and store in vector DB | Yes |
| Method | Endpoint | Description | Auth Required |
|---|---|---|---|
| POST | `/rag/query` | Ask a question against your knowledge base | Yes |
Register a new user:

```
curl -X POST http://localhost:3000/auth/register \
  -H "Content-Type: application/json" \
  -d '{
    "email": "user@example.com",
    "password": "your_secure_password"
  }'
```

Response:

```json
{
  "message": "User registered successfully",
  "accessToken": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "user": {
    "id": "a1b2c3d4-...",
    "email": "user@example.com"
  }
}
```

Log in to get a JWT token:

```
curl -X POST http://localhost:3000/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "email": "user@example.com",
    "password": "your_secure_password"
  }'
```

Response:

```json
{
  "accessToken": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "user": {
    "id": "a1b2c3d4-...",
    "email": "user@example.com"
  }
}
```

Upload a document:

```
curl -X POST http://localhost:3000/documents/upload \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -F "file=@/path/to/document.pdf"
```

Response:

```json
{
  "docId": "b61d7d5b-1485-4ea0-9c3d-442a9ca5d69d",
  "originalName": "document.pdf",
  "storedName": "document-1735123456789-123456789.pdf",
  "mimeType": "application/pdf",
  "size": 102400,
  "storagePath": "/path/to/storage/uploads/document-xxx.pdf",
  "status": "uploaded",
  "createdAt": "2025-12-25T10:00:00.000Z"
}
```

Ingest the document:

```
curl -X POST http://localhost:3000/documents/b61d7d5b-1485-4ea0-9c3d-442a9ca5d69d/ingest \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
```

Response:

```json
{
  "docId": "b61d7d5b-1485-4ea0-9c3d-442a9ca5d69d",
  "status": "ingested",
  "chunks": 42,
  "embeddingModel": "text-embedding-3-small"
}
```

Query the knowledge base:

```
curl -X POST http://localhost:3000/rag/query \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -d '{
    "query": "What is the main topic of the document?",
    "topK": 5
  }'
```

Response:

```json
{
  "answer": "The main topic of the document is...",
  "sources": [
    { "docId": "b61d7d5b-...", "source": "document.pdf", "chunkIndex": 3 }
  ],
  "debug": {
    "topK": 5,
    "embeddingModel": "text-embedding-3-small",
    "chatModel": "gpt-4o-mini",
    "matched": 5
  }
}
```

Query scoped to a single document:

```
curl -X POST http://localhost:3000/rag/query \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -d '{
    "query": "Summarize the key points",
    "docId": "b61d7d5b-1485-4ea0-9c3d-442a9ca5d69d"
  }'
```

```
src/
├── main.ts                      # Application entry point
├── app.module.ts                # Root module
├── documents/                   # Document upload & metadata management
│   ├── document.entity.ts       # TypeORM entity
│   ├── documents.controller.ts
│   ├── documents.service.ts
│   └── documents.module.ts
├── ingestion/                   # Text extraction & chunking
│   ├── ingestion.controller.ts
│   ├── ingestion.service.ts
│   └── ingestion.module.ts
├── embeddings/                  # OpenAI embeddings generation
│   ├── embeddings.service.ts
│   └── embeddings.module.ts
├── vector-store/                # ChromaDB integration
│   ├── chroma.store.ts
│   └── vector-store.module.ts
├── llm/                         # OpenAI chat completion
│   ├── llm.service.ts
│   └── llm.module.ts
├── rag/                         # RAG query orchestration
│   ├── rag.controller.ts
│   ├── rag.service.ts
│   ├── rag.module.ts
│   └── dto/
│       └── rag-query.dto.ts
├── auth/                        # JWT authentication
│   ├── auth.controller.ts
│   ├── auth.service.ts
│   ├── auth.module.ts
│   ├── users.service.ts
│   ├── entities/
│   │   └── user.entity.ts
│   ├── dto/
│   │   ├── login.dto.ts
│   │   └── register.dto.ts
│   ├── guards/
│   │   └── jwt-auth.guard.ts
│   ├── strategies/
│   │   └── jwt.strategy.ts
│   └── decorators/
│       ├── public.decorator.ts
│       └── current-user.decorator.ts
└── database/                    # Database migrations
    ├── data-source.ts           # TypeORM CLI config
    └── migrations/              # Migration files
```
```
pnpm run start:dev         # Start in watch mode
pnpm run build             # Build for production
pnpm run start:prod        # Run production build
pnpm run lint              # Run ESLint
pnpm run format            # Format code with Prettier
pnpm run test              # Run unit tests
pnpm run test:e2e          # Run end-to-end tests
pnpm run test:cov          # Run tests with coverage

# Database Migrations
pnpm run migration:run     # Run pending migrations
pnpm run migration:revert  # Revert last migration
pnpm run migration:show    # Show migration status
pnpm run migration:generate src/database/migrations/Name  # Auto-generate from entity changes
pnpm run migration:create src/database/migrations/Name    # Create empty migration
```

The application uses TypeORM migrations for database schema management. Migrations run automatically on startup when `DB_MIGRATIONS_RUN=true` (the default).
- Modify your entity (e.g., add a new column)
- Generate a migration: `pnpm run migration:generate src/database/migrations/AddNewColumn`
- Review the generated file in `src/database/migrations/`
- Commit the migration file
Migrations run automatically on app startup. For manual control:
```
# Build first (required for migrations)
pnpm run build

# Run migrations manually
pnpm run migration:run

# Then start the app with migrations disabled
DB_MIGRATIONS_RUN=false pnpm run start:prod
```

| Type | Extensions | MIME Types |
|---|---|---|
| PDF | `.pdf` | `application/pdf` |
| Plain Text | `.txt` | `text/plain` |
| Markdown | `.md` | `text/markdown` |
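For illustration, a minimal MIME allowlist check matching this table might look like the sketch below. The names here are assumptions; the actual validation lives in the upload pipeline and may also inspect file extensions.

```typescript
// Hypothetical upload filter mirroring the supported-types table above.
// The real guard may additionally check the file extension.
const ALLOWED_MIME_TYPES = new Set([
  'application/pdf',
  'text/plain',
  'text/markdown',
]);

export function isSupportedUpload(mimeType: string): boolean {
  return ALLOWED_MIME_TYPES.has(mimeType);
}
```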
Text is split into overlapping chunks for better context retrieval:
- Chunk Size: 1200 characters (default)
- Overlap: 200 characters (default)
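As a rough illustration, an overlapping chunker with those defaults could look like the following. The function name and exact boundary handling are assumptions, not the actual `ingestion.service.ts` code, which may split on sentence or token boundaries instead.

```typescript
// Sketch of an overlapping text chunker with the documented defaults
// (1200-char chunks, 200-char overlap). Hypothetical implementation.
export function chunkText(
  text: string,
  chunkSize = 1200,
  overlap = 200,
): string[] {
  if (overlap >= chunkSize) {
    throw new Error('overlap must be smaller than chunkSize');
  }
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached
    start += chunkSize - overlap; // next chunk re-reads the last 200 chars
  }
  return chunks;
}
```

Overlap matters because a fact that straddles a chunk boundary would otherwise be split across two embeddings and match neither query well.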
| Variable | Default | Description |
|---|---|---|
| `RAG_TOP_K` | 5 | Number of similar chunks to retrieve |
| `RAG_MAX_CONTEXT_CHARS` | 12000 | Maximum context length for the LLM |
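To show how these two settings interact, here is a hedged sketch: retrieved chunks are capped at `RAG_TOP_K`, then concatenated until adding another chunk would exceed `RAG_MAX_CONTEXT_CHARS`. The function and parameter names are assumptions, not the actual `rag.service.ts` API.

```typescript
// Sketch: bound the prompt context by both chunk count and total size.
export function buildContext(
  chunks: string[],
  topK = 5,
  maxContextChars = 12000,
): string {
  let context = '';
  for (const chunk of chunks.slice(0, topK)) {
    const sep = context ? '\n\n' : '';
    // Stop before the next chunk would push the context past the cap.
    if (context.length + sep.length + chunk.length > maxContextChars) break;
    context += sep + chunk;
  }
  return context;
}
```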
You can adjust the `temperature` parameter and the system message prompt in `src/llm/llm.service.ts` to tune answer accuracy and context scope.
UNLICENSED - Private project