Skip to content

gitpavleenbali/DocGenAI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

4 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

DocGenAI - Enterprise RAG Solution πŸš€

DocGenAI Banner RAG One Command Production Ready

🎯 Deploy a complete PDF document analysis solution with Azure AI Foundry + RAG capabilities in just one command!

πŸš€ Quick Start β€’ πŸ—οΈ Architecture β€’ πŸ“‹ Features β€’ οΏ½ Cost β€’ οΏ½πŸ”§ Development


🎯 What is DocGenAI?

DocGenAI is an enterprise-grade Retrieval Augmented Generation (RAG) solution that transforms how organizations interact with their PDF documents. Upload any PDF document and ask intelligent questions about its content using natural language.

✨ Key Highlights

  • πŸ€– AI-Powered: Azure OpenAI GPT-4o-mini for intelligent responses
  • πŸ“„ PDF Analysis: Advanced document processing and chunking
  • πŸ” Vector Search: Semantic search with Azure AI Search (1536-dimensional embeddings)
  • 🌐 Modern UI: React TypeScript with Fluent UI components
  • ⚑ One-Command Deploy: From zero to production in 10 minutes
  • πŸ” Enterprise Security: Azure Managed Identity and RBAC
  • πŸ“Š Real-time Monitoring: Application Insights and Log Analytics
  • πŸ”„ Auto-Scaling: Container Apps with 0-100+ instances based on demand

πŸš€ Quick Start

Option 1: Windows PowerShell (Recommended)

git clone https://github.com/gitpavleenbali/DocGenAI.git
cd DocGenAI
.\deploy.ps1

Option 2: Mac/Linux Bash

git clone https://github.com/gitpavleenbali/DocGenAI.git
cd DocGenAI
chmod +x deploy.sh
./deploy.sh

⏱️ Deployment Timeline

⏳ Prerequisites Check     (1-2 minutes)
πŸ” Azure Authentication   (1 minute)  
πŸ—οΈ Infrastructure Setup   (5-8 minutes)
πŸš€ Application Deployment (2-3 minutes)
βœ… Ready to Use!          (Total: ~10 minutes)

What happens during deployment:

  1. βœ… Checks Azure CLI, Docker, and prerequisites
  2. βœ… Creates Azure resource group and services
  3. βœ… Deploys AI models (GPT-4o-mini, text-embedding-3-small)
  4. βœ… Builds and deploys FastAPI backend + React frontend
  5. βœ… Configures security with managed identities
  6. βœ… Sets up monitoring and logging

πŸ—οΈ Architecture

🎨 System Overview

graph TB
    Users[πŸ‘₯ Users] --> WebApp[🌐 React Web App]
    Users --> Teams[πŸ’¬ Microsoft Teams]
    
    Teams --> CopilotStudio[πŸ€– Copilot Studio Bot]
    
    WebApp --> ContainerApp[🐳 Container Apps API]
    CopilotStudio --> ContainerApp
    
    ContainerApp --> AIFoundry[🧠 Azure AI Foundry]
    ContainerApp --> Storage[πŸ’Ύ Azure Storage]
    ContainerApp --> CosmosDB[πŸ—„οΈ Cosmos DB]
    ContainerApp --> AISearch[πŸ” Azure AI Search]
    
    AIFoundry --> OpenAI[πŸ€– Azure OpenAI<br/>GPT-4o-mini<br/>text-embedding-3-small]
    
    subgraph "Monitoring & Security"
        AppInsights[πŸ“Š Application Insights]
        LogAnalytics[πŸ“ Log Analytics]
        ManagedIdentity[πŸ” Managed Identity]
    end
    
    ContainerApp --> AppInsights
    ContainerApp --> ManagedIdentity
    
    style WebApp fill:#e1f5fe
    style ContainerApp fill:#f3e5f5
    style AIFoundry fill:#fff3e0
    style Storage fill:#e8f5e8
    style CosmosDB fill:#fff8e1
    style AISearch fill:#f1f8e9
Loading

πŸ”„ RAG Data Flow

sequenceDiagram
    participant User
    participant WebApp as React Web App
    participant API as FastAPI (Container Apps)
    participant Storage as Azure Storage
    participant AI as Azure OpenAI
    participant Search as Azure AI Search
    participant Cosmos as Cosmos DB
    
    Note over User,Cosmos: Document Upload & Processing
    User->>WebApp: Upload PDF Document
    WebApp->>API: POST /documents/upload
    API->>Storage: Store PDF file
    API->>API: Extract text + chunk (1000 tokens, 200 overlap)
    API->>AI: Generate embeddings (text-embedding-3-small)
    AI-->>API: Return 1536-dimensional vectors
    API->>Search: Index embeddings with metadata
    API->>Cosmos: Store document metadata + chunks
    API-->>WebApp: Return document ID + processing status
    
    Note over User,Cosmos: RAG Query Processing
    User->>WebApp: Ask question about document
    WebApp->>API: POST /chat
    API->>AI: Generate query embedding
    AI-->>API: Return query vector
    API->>Search: Vector similarity search (top-k retrieval)
    Search-->>API: Return relevant chunks with scores
    API->>AI: Generate answer with context (GPT-4o-mini)
    Note right of AI: Prompt includes:<br/>- User question<br/>- Retrieved chunks<br/>- Document metadata
    AI-->>API: Return contextual AI response
    API-->>WebApp: Return answer + source references
Loading

πŸ› οΈ Technology Stack

Layer Technology Purpose Configuration
Frontend React 18 + TypeScript Modern web interface Fluent UI components
Backend FastAPI + Python 3.9 RAG processing engine Async processing
AI Models GPT-4o-mini + text-embedding-3-small Chat + embeddings Azure OpenAI
Vector Store Azure AI Search Semantic search 1536-dimensional vectors
Document Store Azure Blob Storage PDF file storage Hot tier
Metadata Cosmos DB Document metadata Serverless billing
Hosting Azure Container Apps Auto-scaling hosting 1-10 replicas
Monitoring Application Insights Performance tracking Real-time metrics

πŸ“‹ Features

πŸ€– AI Capabilities

  • Intelligent Q&A: Ask questions about your documents in natural language
  • Context-Aware: Responses include relevant document excerpts
  • Multi-Document: Query across multiple uploaded documents
  • Semantic Search: Find content based on meaning, not just keywords

πŸ“„ Document Processing

  • PDF Support: Upload and process PDF documents
  • Smart Chunking: Intelligent text segmentation for optimal retrieval
  • Metadata Extraction: Automatic extraction of document properties
  • Vector Embeddings: High-quality text embeddings for search

🌐 User Experience

  • Drag & Drop Upload: Intuitive document upload interface
  • Real-Time Chat: Instant responses with typing indicators
  • Document Management: View and manage uploaded documents
  • Responsive Design: Works on desktop, tablet, and mobile

πŸ” Enterprise Features

  • Azure AD Integration: Enterprise authentication
  • Managed Identity: Secure service-to-service communication
  • RBAC: Role-based access control
  • Audit Logging: Comprehensive activity tracking
  • Data Residency: Configure deployment region for compliance

πŸ’° Cost Structure

Development Environment

πŸ“Š Estimated Monthly Costs:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Service                 β”‚ Monthly Cost β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Container Apps          β”‚ ~$15         β”‚
β”‚ Azure OpenAI            β”‚ ~$20-50      β”‚
β”‚ Azure AI Search (Basic) β”‚ ~$250        β”‚
β”‚ Cosmos DB (Serverless)  β”‚ ~$5-15       β”‚
β”‚ Blob Storage            β”‚ ~$5          β”‚
β”‚ Application Insights    β”‚ ~$5          β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ TOTAL                   β”‚ ~$300-340    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ’‘ Scales with usage - pay only for what you use

Cost Optimization Tips

  • AI Search: Largest cost component - consider Basic tier for development
  • OpenAI: Token-based pricing - optimize prompts and responses
  • Cosmos DB: Serverless model scales to zero when not in use

πŸ§ͺ Testing Your Deployment

After successful deployment, test the system:

1. Upload Test Document

Use the included test-document.txt file or upload your own PDF.

2. Try Sample Questions

❓ "What Azure services are used in DocGenAI?"
❓ "How does the RAG pipeline work?"
❓ "What are the main features of this solution?"
❓ "What is the estimated cost for running this system?"

3. Verify Components

  • βœ… Document upload works
  • βœ… AI responses are relevant
  • βœ… Vector search returns accurate results
  • βœ… All Azure services are running

πŸ”§ Development

Prerequisites

  • Azure Subscription with contributor access
  • Docker Desktop - Download here
  • Git (automatically installed by script)

Local Development

Backend Development

cd api
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --reload --port 8000

Frontend Development

cd webapp
npm install
npm start

Project Structure

DocGenAI/
β”œβ”€β”€ πŸ“ infra/              # Bicep infrastructure templates
β”‚   β”œβ”€β”€ main.bicep         # Main deployment template
β”‚   └── modules/           # Service-specific modules
β”œβ”€β”€ πŸ“ api/                # FastAPI backend
β”‚   β”œβ”€β”€ main.py           # Application entry point
β”‚   β”œβ”€β”€ routers/          # API route handlers
β”‚   └── services/         # Business logic
β”œβ”€β”€ πŸ“ webapp/             # React frontend
β”‚   β”œβ”€β”€ src/              # Source code
β”‚   β”œβ”€β”€ public/           # Static assets
β”‚   └── package.json      # Dependencies
β”œβ”€β”€ πŸ“ scripts/            # Deployment utilities
β”œβ”€β”€ πŸ“ docs/               # Additional documentation
β”œβ”€β”€ deploy.ps1             # Windows deployment script
β”œβ”€β”€ deploy.sh              # Linux/Mac deployment script
└── azure.yaml            # Azure Developer CLI config

Advanced Configuration

Custom Environment Name

.\deploy.ps1 -EnvironmentName "my-production-env"

Skip Prerequisites Check

.\deploy.ps1 -SkipPrerequisites

Environment Variables

The system automatically configures these environment variables:

  • AZURE_OPENAI_ENDPOINT
  • AZURE_OPENAI_KEY
  • AZURE_SEARCH_ENDPOINT
  • AZURE_SEARCH_KEY
  • AZURE_STORAGE_ACCOUNT
  • COSMOS_DB_ENDPOINT

🚨 Troubleshooting

Common Issues

"Azure CLI not found"
# Install Azure CLI
winget install Microsoft.AzureCLI
# Restart your terminal
"azd command not found"
# Install Azure Developer CLI
winget install Microsoft.Azd
# Restart your terminal
"Docker not running"
  • Start Docker Desktop
  • Ensure Docker daemon is running
  • Check if containers can be created: docker run hello-world
"Deployment failed"
# Check detailed logs
azd show
az group list
azd logs

# Common fixes:
# 1. Ensure you have Contributor role on subscription
# 2. Check if resource names are unique
# 3. Verify quota limits for chosen region

Resource Cleanup

# Remove all deployed resources
azd down --purge

Get Help

  1. Check the troubleshooting section above
  2. Review deployment logs: azd logs
  3. Check Azure Portal for resource status
  4. Create a GitHub issue with logs attached

πŸ“Š Monitoring & Observability

Application Insights Dashboard

  • Performance Metrics: Response times, throughput
  • Error Tracking: Exceptions and failed requests
  • User Analytics: Usage patterns and behavior
  • Dependency Tracking: External service calls

Health Endpoints

  • API Health: GET /health
  • Database Connectivity: GET /health/db
  • AI Services Status: GET /health/ai

🀝 Contributing

We welcome contributions! Here's how to get started:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Test your changes with the deployment script
  4. Commit your changes: git commit -m 'Add amazing feature'
  5. Push to the branch: git push origin feature/amazing-feature
  6. Open a Pull Request

Development Guidelines

  • Follow existing code style and conventions
  • Add tests for new features
  • Update documentation as needed
  • Ensure deployment script still works

πŸ“œ License

This project is licensed under the MIT License - see the LICENSE file for details.


πŸŽ‰ What You Get After Deployment

🌐 Web Application

Modern React interface for document upload and chat

πŸ”Œ API Endpoints

RESTful backend with OpenAPI documentation

πŸ“Š Azure Dashboard

Complete monitoring and management portal

πŸ“ˆ Scalable Infrastructure

Auto-scaling based on demand


Ready to transform your document workflow?

πŸš€ Start Now | πŸ“– Learn More | πŸ’¬ Get Support

Happy Document Chatting! πŸ€–πŸ“„

About

Complete DocGenAI RAG Solution with Azure OpenAI, AI Search, and one-command deployment

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors