π― Deploy a complete PDF document analysis solution with Azure AI Foundry + RAG capabilities in just one command!
π Quick Start β’ ποΈ Architecture β’ π Features β’ οΏ½ Cost β’ οΏ½π§ Development
DocGenAI is an enterprise-grade Retrieval Augmented Generation (RAG) solution that transforms how organizations interact with their PDF documents. Upload any PDF document and ask intelligent questions about its content using natural language.
- π€ AI-Powered: Azure OpenAI GPT-4o-mini for intelligent responses
- π PDF Analysis: Advanced document processing and chunking
- π Vector Search: Semantic search with Azure AI Search (1536-dimensional embeddings)
- π Modern UI: React TypeScript with Fluent UI components
- β‘ One-Command Deploy: From zero to production in 10 minutes
- π Enterprise Security: Azure Managed Identity and RBAC
- π Real-time Monitoring: Application Insights and Log Analytics
- π Auto-Scaling: Container Apps with 0-100+ instances based on demand
git clone https://github.com/gitpavleenbali/DocGenAI.git
cd DocGenAI
.\deploy.ps1git clone https://github.com/gitpavleenbali/DocGenAI.git
cd DocGenAI
chmod +x deploy.sh
./deploy.shβ³ Prerequisites Check (1-2 minutes)
π Azure Authentication (1 minute)
ποΈ Infrastructure Setup (5-8 minutes)
π Application Deployment (2-3 minutes)
β
Ready to Use! (Total: ~10 minutes)
What happens during deployment:
- β Checks Azure CLI, Docker, and prerequisites
- β Creates Azure resource group and services
- β Deploys AI models (GPT-4o-mini, text-embedding-3-small)
- β Builds and deploys FastAPI backend + React frontend
- β Configures security with managed identities
- β Sets up monitoring and logging
graph TB
Users[π₯ Users] --> WebApp[π React Web App]
Users --> Teams[π¬ Microsoft Teams]
Teams --> CopilotStudio[π€ Copilot Studio Bot]
WebApp --> ContainerApp[π³ Container Apps API]
CopilotStudio --> ContainerApp
ContainerApp --> AIFoundry[π§ Azure AI Foundry]
ContainerApp --> Storage[πΎ Azure Storage]
ContainerApp --> CosmosDB[ποΈ Cosmos DB]
ContainerApp --> AISearch[π Azure AI Search]
AIFoundry --> OpenAI[π€ Azure OpenAI<br/>GPT-4o-mini<br/>text-embedding-3-small]
subgraph "Monitoring & Security"
AppInsights[π Application Insights]
LogAnalytics[π Log Analytics]
ManagedIdentity[π Managed Identity]
end
ContainerApp --> AppInsights
ContainerApp --> ManagedIdentity
style WebApp fill:#e1f5fe
style ContainerApp fill:#f3e5f5
style AIFoundry fill:#fff3e0
style Storage fill:#e8f5e8
style CosmosDB fill:#fff8e1
style AISearch fill:#f1f8e9
sequenceDiagram
participant User
participant WebApp as React Web App
participant API as FastAPI (Container Apps)
participant Storage as Azure Storage
participant AI as Azure OpenAI
participant Search as Azure AI Search
participant Cosmos as Cosmos DB
Note over User,Cosmos: Document Upload & Processing
User->>WebApp: Upload PDF Document
WebApp->>API: POST /documents/upload
API->>Storage: Store PDF file
API->>API: Extract text + chunk (1000 tokens, 200 overlap)
API->>AI: Generate embeddings (text-embedding-3-small)
AI-->>API: Return 1536-dimensional vectors
API->>Search: Index embeddings with metadata
API->>Cosmos: Store document metadata + chunks
API-->>WebApp: Return document ID + processing status
Note over User,Cosmos: RAG Query Processing
User->>WebApp: Ask question about document
WebApp->>API: POST /chat
API->>AI: Generate query embedding
AI-->>API: Return query vector
API->>Search: Vector similarity search (top-k retrieval)
Search-->>API: Return relevant chunks with scores
API->>AI: Generate answer with context (GPT-4o-mini)
Note right of AI: Prompt includes:<br/>- User question<br/>- Retrieved chunks<br/>- Document metadata
AI-->>API: Return contextual AI response
API-->>WebApp: Return answer + source references
| Layer | Technology | Purpose | Configuration |
|---|---|---|---|
| Frontend | React 18 + TypeScript | Modern web interface | Fluent UI components |
| Backend | FastAPI + Python 3.9 | RAG processing engine | Async processing |
| AI Models | GPT-4o-mini + text-embedding-3-small | Chat + embeddings | Azure OpenAI |
| Vector Store | Azure AI Search | Semantic search | 1536-dimensional vectors |
| Document Store | Azure Blob Storage | PDF file storage | Hot tier |
| Metadata | Cosmos DB | Document metadata | Serverless billing |
| Hosting | Azure Container Apps | Auto-scaling hosting | 1-10 replicas |
| Monitoring | Application Insights | Performance tracking | Real-time metrics |
- Intelligent Q&A: Ask questions about your documents in natural language
- Context-Aware: Responses include relevant document excerpts
- Multi-Document: Query across multiple uploaded documents
- Semantic Search: Find content based on meaning, not just keywords
- PDF Support: Upload and process PDF documents
- Smart Chunking: Intelligent text segmentation for optimal retrieval
- Metadata Extraction: Automatic extraction of document properties
- Vector Embeddings: High-quality text embeddings for search
- Drag & Drop Upload: Intuitive document upload interface
- Real-Time Chat: Instant responses with typing indicators
- Document Management: View and manage uploaded documents
- Responsive Design: Works on desktop, tablet, and mobile
- Azure AD Integration: Enterprise authentication
- Managed Identity: Secure service-to-service communication
- RBAC: Role-based access control
- Audit Logging: Comprehensive activity tracking
- Data Residency: Configure deployment region for compliance
π Estimated Monthly Costs:
βββββββββββββββββββββββββββ¬βββββββββββββββ
β Service β Monthly Cost β
βββββββββββββββββββββββββββΌβββββββββββββββ€
β Container Apps β ~$15 β
β Azure OpenAI β ~$20-50 β
β Azure AI Search (Basic) β ~$250 β
β Cosmos DB (Serverless) β ~$5-15 β
β Blob Storage β ~$5 β
β Application Insights β ~$5 β
βββββββββββββββββββββββββββΌβββββββββββββββ€
β TOTAL β ~$300-340 β
βββββββββββββββββββββββββββ΄βββββββββββββββ
π‘ Scales with usage - pay only for what you use
- AI Search: Largest cost component - consider Basic tier for development
- OpenAI: Token-based pricing - optimize prompts and responses
- Cosmos DB: Serverless model scales to zero when not in use
After successful deployment, test the system:
Use the included test-document.txt file or upload your own PDF.
β "What Azure services are used in DocGenAI?"
β "How does the RAG pipeline work?"
β "What are the main features of this solution?"
β "What is the estimated cost for running this system?"
- β Document upload works
- β AI responses are relevant
- β Vector search returns accurate results
- β All Azure services are running
- Azure Subscription with contributor access
- Docker Desktop - Download here
- Git (automatically installed by script)
cd api
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --reload --port 8000cd webapp
npm install
npm startDocGenAI/
βββ π infra/ # Bicep infrastructure templates
β βββ main.bicep # Main deployment template
β βββ modules/ # Service-specific modules
βββ π api/ # FastAPI backend
β βββ main.py # Application entry point
β βββ routers/ # API route handlers
β βββ services/ # Business logic
βββ π webapp/ # React frontend
β βββ src/ # Source code
β βββ public/ # Static assets
β βββ package.json # Dependencies
βββ π scripts/ # Deployment utilities
βββ π docs/ # Additional documentation
βββ deploy.ps1 # Windows deployment script
βββ deploy.sh # Linux/Mac deployment script
βββ azure.yaml # Azure Developer CLI config
.\deploy.ps1 -EnvironmentName "my-production-env".\deploy.ps1 -SkipPrerequisitesThe system automatically configures these environment variables:
AZURE_OPENAI_ENDPOINTAZURE_OPENAI_KEYAZURE_SEARCH_ENDPOINTAZURE_SEARCH_KEYAZURE_STORAGE_ACCOUNTCOSMOS_DB_ENDPOINT
"Azure CLI not found"
# Install Azure CLI
winget install Microsoft.AzureCLI
# Restart your terminal"azd command not found"
# Install Azure Developer CLI
winget install Microsoft.Azd
# Restart your terminal"Docker not running"
- Start Docker Desktop
- Ensure Docker daemon is running
- Check if containers can be created:
docker run hello-world
"Deployment failed"
# Check detailed logs
azd show
az group list
azd logs
# Common fixes:
# 1. Ensure you have Contributor role on subscription
# 2. Check if resource names are unique
# 3. Verify quota limits for chosen region# Remove all deployed resources
azd down --purge- Check the troubleshooting section above
- Review deployment logs:
azd logs - Check Azure Portal for resource status
- Create a GitHub issue with logs attached
- Performance Metrics: Response times, throughput
- Error Tracking: Exceptions and failed requests
- User Analytics: Usage patterns and behavior
- Dependency Tracking: External service calls
- API Health:
GET /health - Database Connectivity:
GET /health/db - AI Services Status:
GET /health/ai
We welcome contributions! Here's how to get started:
- Fork the repository
- Create a feature branch:
git checkout -b feature/amazing-feature - Test your changes with the deployment script
- Commit your changes:
git commit -m 'Add amazing feature' - Push to the branch:
git push origin feature/amazing-feature - Open a Pull Request
- Follow existing code style and conventions
- Add tests for new features
- Update documentation as needed
- Ensure deployment script still works
This project is licensed under the MIT License - see the LICENSE file for details.
Modern React interface for document upload and chat
RESTful backend with OpenAPI documentation
Complete monitoring and management portal
Auto-scaling based on demand
Ready to transform your document workflow?
π Start Now | π Learn More | π¬ Get Support
Happy Document Chatting! π€π