Live Demo: https://smartdocq.vercel.app
In today's information-driven world, efficiently extracting insights from documents is crucial for academic success and professional productivity. The growing volume of digital documents presents challenges in comprehension, knowledge retention, and information retrieval. SmartDocQ is an intelligent document processing platform that leverages advanced AI technology to transform how users interact with their documents.
SmartDocQ is a comprehensive full-stack web application that enables users to upload documents, engage with content through natural language queries, and generate educational resources automatically. By combining Retrieval-Augmented Generation (RAG) with Google's Gemini AI, the platform delivers accurate, context-aware responses while maintaining document privacy and security.
- Document Upload & Processing: Support for PDF, DOCX, and TXT files with intelligent text extraction and preprocessing
- AI-Powered Chat: Interactive question-answering system that provides context-aware responses based on uploaded documents
- Quiz Generation: Automatic creation of multiple-choice, true/false, and short-answer questions from document content
- Flashcard Creation: Smart extraction of key concepts and definitions for effective learning and revision
- Text Summarization: Concise summaries of document content for quick comprehension
- Sensitive Data Detection: Automatic identification of personal information (emails, phone numbers, Aadhaar, PAN, credit cards, SSN)
- User Consent Workflow: Privacy-first approach requiring explicit consent before processing sensitive documents
- Content Moderation: Profanity filtering and URL validation to maintain platform integrity
- httpOnly Cookie Authentication: Secure user sessions with role-based access control (User, Admin, Moderator)
- Centralized Server-Side Validation: Auth and admin APIs validate all inputs with Zod schemas before any business logic or database access.
- Strict Admin Authorization: Admin endpoints are protected by middleware that requires an authenticated user with isAdmin = true; there are no hardcoded admin credentials or token backdoors.
- User Management: Comprehensive admin dashboard for user oversight and role assignment
- Document Analytics: Track document uploads, processing status, and usage statistics
- Report Management: Handle user feedback and support inquiries efficiently
- System Monitoring: Real-time logs and performance metrics
- React.js 18.x: Modern component-based UI framework
- React Router DOM: Client-side routing and navigation
- i18next: Internationalization support
- GSAP & Lottie: Smooth animations and interactive elements
- Focus Trap React: Accessibility features
- Node.js & Express 5.x: RESTful API server
- Mongoose 8.x: MongoDB object modeling
- JWT & bcryptjs: Authentication and password security
- Multer: File upload handling
- CORS: Cross-origin resource sharing configuration
- Flask 3.x: Python web framework for AI processing
- Google Gemini 2.5 Flash: Advanced text generation and comprehension
- Text-Embedding-004: High-quality vector embeddings
- ChromaDB 0.5+: Vector database for semantic search
- PyPDF2: PDF text extraction
- python-docx: Microsoft Word document processing
- Better Profanity: Content filtering
- MongoDB Atlas: Primary NoSQL database for user data, documents, and chat history
- ChromaDB: Vector store for document embeddings and semantic retrieval
SmartDocQ follows a three-tier microservice architecture:
- Presentation Layer: React.js frontend providing responsive user interface
- Business Logic Layer: Node.js/Express middleware handling authentication, routing, and database operations
- AI Processing Layer: Flask service managing document processing, embeddings, and AI interactions
This separation lets each tier be developed, deployed, and scaled independently of the others.
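The retrieval step that the AI Processing Layer performs for RAG can be sketched in a few lines. This is a minimal illustration, not SmartDocQ's actual code: the bag-of-words `embed` function is a toy stand-in for Text-Embedding-004, the in-memory list stands in for ChromaDB, and all function names (`chunk_text`, `embed`, `retrieve`, `build_prompt`) are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy stand-in for an embedding model: lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(query, embed(chunk)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble the prompt that would be passed to the language model."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In a real deployment the chunks would be embedded once at indexing time and stored in the vector database, so only the question is embedded per query.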
To set up SmartDocQ locally, you'll need:
- Node.js: Version 20.x or higher
- Python: Version 3.9 or higher
- MongoDB: Local installation or MongoDB Atlas account
- Google AI API Key: For Gemini AI access
- Git: Version control
- Basic understanding of web development and REST APIs
# Fork the repository on GitHub, then clone your fork
git clone https://github.com/your-username/SmartDocQ.git
cd SmartDocQ
# Navigate to servers directory
cd servers
# Install dependencies
npm install
# Create .env file with the following variables:
# PORT=5000
# MONGO_URI=your_mongodb_connection_string
# JWT_SECRET=your_jwt_secret_key
# FRONTEND_ORIGINS=http://localhost:3000
# SERVICE_TOKEN=your_service_token
# FLASK_ASK_URL=http://localhost:5001/api/document/ask
# FLASK_INDEX_URL=http://localhost:5001/api/index-from-atlas
# FLASK_CONVERT_URL=http://localhost:5001/api/convert/word-to-pdf
# Start the server (Node API on http://localhost:5000)
npm start
# Navigate to backend directory
cd ../backend
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Create .env file with:
# PORT=5001
# FRONTEND_ORIGINS=http://localhost:3000
# NODE_BASE_URL=http://localhost:5000
# SERVICE_TOKEN=your_service_token
# GEMINI_API_KEY=your_google_ai_api_key
# Start Flask service
python main.py
# Navigate to my-app directory
cd ../my-app
# Install dependencies
npm install
# Create .env file with:
# REACT_APP_API_URL=http://localhost:5000
# REACT_APP_PY_API_URL=http://localhost:5001
# REACT_APP_GOOGLE_CLIENT_ID=your_google_oauth_client_id
# Start development server
npm start
Open your browser and navigate to:
http://localhost:3000
The application will be running with:
- Frontend: http://localhost:3000
- Backend API (Node): http://localhost:5000
- Flask AI Service: http://localhost:5001
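If one of the three services fails to start, a missing environment variable is a common cause. The sketch below checks a .env file's contents against the variable names listed in the setup steps. The helper names (`parse_env`, `missing_vars`) and the `REQUIRED` table are illustrative and not part of the project; the parser handles only plain KEY=value lines.

```python
# Variable names taken from the setup steps; the directory names are the
# three project folders (servers, backend, my-app).
REQUIRED = {
    "servers": [
        "PORT", "MONGO_URI", "JWT_SECRET", "FRONTEND_ORIGINS", "SERVICE_TOKEN",
        "FLASK_ASK_URL", "FLASK_INDEX_URL", "FLASK_CONVERT_URL",
    ],
    "backend": [
        "PORT", "FRONTEND_ORIGINS", "NODE_BASE_URL", "SERVICE_TOKEN", "GEMINI_API_KEY",
    ],
    "my-app": [
        "REACT_APP_API_URL", "REACT_APP_PY_API_URL", "REACT_APP_GOOGLE_CLIENT_ID",
    ],
}


def parse_env(text):
    """Parse KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as connection
        # strings may themselves contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_vars(service, env_text):
    """Return the required variables that are absent or empty."""
    values = parse_env(env_text)
    return [name for name in REQUIRED[service] if not values.get(name)]
```

For example, `missing_vars("backend", open("backend/.env").read())` returns an empty list when every variable for the Flask service is set.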
- Register/Login: Create an account or sign in to access the platform
- Upload Document: Navigate to the upload page and select your document (PDF, DOCX, or TXT)
- Consent Review: If sensitive data is detected, review and provide consent
- Chat: Ask questions about your document and receive AI-powered answers
- Generate Quiz: Create practice questions to test your understanding
- Create Flashcards: Generate study cards for key concepts
- Summarize: Get concise summaries of document sections
- Share: Share chat conversations with others via unique links
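The consent review step depends on detecting sensitive data before processing. A minimal regex-based sketch of such a detector is shown here, covering the categories the feature list names (emails, phone numbers, Aadhaar, PAN, credit cards, SSN). The patterns, the Luhn check for card numbers, and the function names are assumptions for illustration; SmartDocQ's real detection rules are not documented here and may differ.

```python
import re

# Deliberately simple patterns; real-world detection needs more cases.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 10-digit Indian mobile number, optionally prefixed with +91.
    "phone": re.compile(r"(?<!\d)(?:\+91[ -]?)?[6-9]\d{9}(?!\d)"),
    # 12 digits, optionally grouped 4-4-4; the lookarounds skip digit
    # groups that belong to a longer number such as a card number.
    "aadhaar": re.compile(r"(?<![\d+])(?<!\d[ -])\d{4}[ -]?\d{4}[ -]?\d{4}(?![ -]?\d)"),
    # PAN: five letters, four digits, one letter.
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # 16 digits, optionally grouped 4-4-4-4.
    "credit_card": re.compile(r"(?<!\d)(?:\d{4}[ -]?){3}\d{4}(?!\d)"),
}


def luhn_valid(number):
    """Luhn checksum, used to discard random 16-digit strings."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for i, digit in enumerate(reversed(digits)):
        if i % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def detect_sensitive(text):
    """Return {category: [matches]} for every category found in text."""
    findings = {}
    for kind, pattern in PATTERNS.items():
        matches = [m.group(0) for m in pattern.finditer(text)]
        if kind == "credit_card":
            matches = [m for m in matches if luhn_valid(m)]
        if matches:
            findings[kind] = matches
    return findings


def needs_consent(text):
    """True when the document should go through consent review."""
    return bool(detect_sensitive(text))
```

Returning the matched categories rather than a bare boolean lets the consent screen tell the user what was found before they decide.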
- POST /api/auth/signup - User registration
- POST /api/auth/login - User login (sets httpOnly cookie)
- POST /api/auth/logout - Logout (clears cookie)
- GET /api/auth/verify - Verify session from cookie

- POST /api/document/upload - Upload single document
- POST /api/document/upload/batch - Upload multiple documents (up to 10)
- GET /api/document/my - List user documents (metadata)
- GET /api/document/:id/download - Download original/converted file
- DELETE /api/document/:id - Delete document

- GET /api/chat/:documentId - Get or create chat for a document
- POST /api/chat/:documentId/message - Send message and get AI answer (via Flask)
- POST /api/chat/:documentId/append - Append precomputed messages
- PUT /api/chat/:documentId - Overwrite entire chat
- DELETE /api/chat/:documentId - Delete chat for a document
- DELETE /api/chat - Delete all chats for current user
- PATCH /api/chat/:documentId/message/:index/rating - Rate an assistant message
- GET /api/chat/:documentId/export.pdf - Export chat as PDF

- GET /api/admin/users - List all users
- PUT /api/admin/users/:id/role - Update user role
- GET /api/admin/stats - System statistics
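Because login sets an httpOnly cookie, a script that calls the API has to keep a cookie jar between requests. The following sketch uses only the Python standard library. The endpoint paths come from the list above, but the JSON field names (`email`, `password`), the assumption that responses are JSON, and the helper names are guesses, so check them against the server code before relying on them.

```python
import json
import urllib.request
from http.cookiejar import CookieJar

BASE_URL = "http://localhost:5000"  # Node API address from the local setup


def make_opener():
    """Build an opener that stores cookies, so the session cookie set by
    the login response is sent automatically on later requests."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(CookieJar()))


def login_request(email, password, base_url=BASE_URL):
    """Build POST /api/auth/login (body field names are assumed)."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def list_documents_request(base_url=BASE_URL):
    """Build GET /api/document/my, which lists the user's documents."""
    return urllib.request.Request(f"{base_url}/api/document/my", method="GET")


def send(opener, request):
    """Send a request and decode the response, assuming a JSON body."""
    with opener.open(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Typical use against a running server (not executed here):
#   opener = make_opener()
#   send(opener, login_request("you@example.com", "your-password"))
#   documents = send(opener, list_documents_request())
```

The same opener must be reused for every call; creating a new one discards the cookie jar and with it the session.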
We welcome contributions from the community! Here's how you can help:
- Fork the Repository: Click the "Fork" button at the top of this page
- Create Feature Branch: git checkout -b feature/your-feature-name
- Make Changes: Implement your feature or bug fix
- Test Thoroughly: Ensure all existing tests pass and add new tests if needed
- Commit Changes: git commit -m "Add meaningful commit message"
- Push to Branch: git push origin feature/your-feature-name
- Submit Pull Request: Open a PR with a clear description of your changes
- Follow existing code style and conventions
- Write clear, descriptive commit messages
- Update documentation for any API or feature changes
- Add unit tests for new functionality
- Ensure no sensitive data or API keys are committed
# Run backend tests
cd servers
npm test
# Run frontend tests
cd ../my-app
npm test
# Run Python tests
cd ../backend
pytest
SmartDocQ can be deployed on various platforms:
- Frontend: Vercel, Netlify, or AWS Amplify
- Backend: Heroku, Railway, or AWS EC2
- AI Service: Heroku, Render, or Google Cloud Run
- Database: MongoDB Atlas (recommended)
Refer to DEPLOYMENT_CHECKLIST.md for detailed deployment instructions.
- All passwords are hashed using bcrypt with salt rounds
- httpOnly Cookie Authentication: JWT tokens are stored in secure httpOnly cookies that client-side scripts cannot read, which limits token theft through XSS
- Cookies configured with SameSite and Secure flags in production
- Client-side user data validated with safeParseUser() to prevent corrupted/malicious data
- JWT tokens expire after 1 hour with automatic cleanup on logout
- Sensitive data detection runs before document processing
- Content moderation filters inappropriate content
- Shared chat links use high-entropy IDs, expire (~24h), and are rate-limited on the public endpoints
- CORS configured with credentials support for specific allowed origins
- Environment variables store sensitive configuration
- Cross-tab authentication sync for consistent session state
- Multilingual Support: Document processing and AI responses in multiple languages
- Advanced Analytics: Detailed insights on document usage and learning patterns
- Collaborative Features: Shared workspaces and team document libraries
- Mobile Application: Native iOS and Android apps
- Integration APIs: Connect with learning management systems (LMS)
- Voice Interaction: Voice-based queries and responses
- Offline Mode: Local document processing without internet
Special thanks to:
- Google AI team for Gemini API access
- The open-source community for excellent libraries and frameworks
- Contributors who have helped improve this project
For questions, issues, or feature requests:
- Issues: GitHub Issues
- Email: smartdocq@gmail.com
Thanks to all the contributors who have helped build SmartDocQ:
- Dr-Venom29
- ANIRUDH-7600
- sameekhsa
- ananya-1507
- srithi-05