A powerful, intuitive platform that lets users converse with their documents in natural language, leverage AI for deep insights, intelligently modify structured data, and visualize extracted information. The application is designed to enhance document comprehension, analysis, and collaboration by bridging the gap between raw data and actionable intelligence.

## Features
- Multi-Document Analysis: Upload and analyze multiple PDF documents, extracting diverse content types for comprehensive insights.
- Intelligent AI Chat Interface: Engage in natural language conversations with your documents. The AI provides insights, summaries, answers to queries, and context-aware explanations based on document content.
- Document Preview & Interaction: Offers a real-time preview of uploaded documents, allowing users to navigate content and interact directly, potentially highlighting sections for focused AI analysis.
- Advanced Content Extraction: Utilizes sophisticated processing to extract `TextChunk` (raw text), `Table` (structured tabular data), and `Image` (visual elements) objects from documents, forming the foundation for AI analysis.
- AI-Driven Table Modification: Request the AI to modify tables within documents using natural language commands. The system generates structured output: a modification summary, a clean modified table, and a detailed change log.
- High-Performance Table Rendering: Efficiently displays even very large tables using frontend virtualization techniques, ensuring smooth scrolling and optimal performance without impacting browser responsiveness.
- Data Export & Download: Download modified tables directly from the chat interface as Excel files for further analysis or as Markdown files for easy sharing.
- Interactive Visualizations: Provides capabilities for generating and displaying visual representations of extracted data and information retrieval results, with a focus on user-specific insights.
- Conversation Management: Manage chat histories, with potential features for saving, clearing, and forking conversations to explore different discussion paths.
- Responsive & Intuitive UI: Features a clean, modern, and responsive user interface designed for a seamless and engaging user experience.
- Robust File Handling: Leverages Cloudinary for secure and efficient storage and delivery of all document assets.
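The table-virtualization idea mentioned above is simple: render only the rows that intersect the viewport, plus a small overscan buffer. A minimal Python sketch of the window arithmetic (illustrative only; `react-window`'s actual implementation is JavaScript and differs in detail):

```python
def visible_range(scroll_top: int, viewport_height: int,
                  row_height: int, total_rows: int, overscan: int = 3):
    """Return the [start, end) row indices worth rendering.

    Only these rows receive DOM nodes; everything above and below is
    empty space, which keeps huge tables responsive while scrolling.
    """
    first = scroll_top // row_height
    last = (scroll_top + viewport_height) // row_height
    start = max(0, first - overscan)
    end = min(total_rows, last + overscan + 1)
    return start, end
```

With 1,000 rows of 40px in a 400px viewport, only about 14 rows are mounted at any time instead of all 1,000.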
## Tech Stack

This platform is built on a modern, robust technology stack, ensuring scalability, performance, and maintainability.
- Frontend:
- Next.js: React-based framework for server-side rendering, routing, and efficient frontend development.
- React: JavaScript library for building user interfaces.
- TypeScript: Strongly-typed superset of JavaScript for enhanced code quality and maintainability.
- Tailwind CSS: Utility-first CSS framework for rapid and consistent UI styling.
- Zustand: Fast and lightweight state management solution for React applications.
- Axios: Promise-based HTTP client for making API requests.
- react-window: Library for efficient virtualization of large lists and tabular data.
- react-markdown: React component to render Markdown as React components.
- remark-gfm: A remark plugin that adds GitHub Flavored Markdown (GFM) support, essential for table rendering.
- Backend:
- Python: Primary programming language for backend services.
- FastAPI: Modern, fast (high-performance) web framework for building APIs with Python 3.7+ based on standard Python type hints.
- LangChain: Framework for developing applications powered by large language models.
- Beanie: Asynchronous MongoDB ODM for Python, simplifying database interactions.
- MongoDB: NoSQL database for flexible and scalable data storage.
- AI/LLM:
- Integrated with various Large Language Models (LLMs) (e.g., OpenAI, Groq, Llama3, Mixtral, Gemma) for conversational AI and advanced analysis.
- Cloud Services:
- Cloudinary: Cloud-based service for image and video management, used for storing and delivering document assets.
## Architecture

The platform follows a clear separation of concerns across three layers: the frontend, the backend, and a core document processing pipeline.

### Frontend
- Next.js Application: Serves as the web application framework, providing routing, API routes, and static asset serving.
- Component-Based UI: Built with reusable React components (e.g., `Button`, `Layout`, `LoadingSpinner`) to ensure modularity and ease of development.
- `MessageBubble`: The core component for displaying chat messages. It intelligently renders different content types:
  - Standard text and small tables using `ReactMarkdown` with `remark-gfm`.
  - Large tables using a dedicated `VirtualizedTable` component for performance.
  - Rich metadata, such as download buttons for modified tables.
- `VirtualizedTable`: A specialized component utilizing `react-window` to render large tabular data efficiently, ensuring smooth scrolling.
- `DocumentToolbar`: Handles interactive elements related to document viewing (e.g., zoom controls).
- State Management: Uses `useChatStore` (Zustand) for global state covering chat sessions, messages, and application-wide settings.
- API Client: An `api.ts` utility (Axios-based) centralizes all frontend-to-backend communication.
- Styling: Consistent and responsive styling achieved with Tailwind CSS.
### Backend

- FastAPI Application: The core API layer that handles all incoming requests and routes them to the appropriate services.
- `MultiChatService`: The central service for all chat-related functionality:
  - Manages `ChatSession` and `ChatMessage` lifecycles.
  - Integrates with LLMs to generate conversational responses and perform advanced analytical tasks.
  - Handles AI-driven table modifications, parsing complex LLM output into structured data (summary, change log, and the modified table in Markdown).
  - Manages chat history and context for LLM interactions.
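Parsing an LLM's table-modification reply into the summary / modified table / change-log triple might look like the sketch below. It assumes the model is prompted to answer under fixed `## Summary`, `## Modified Table`, and `## Change Log` headers; the service's real output contract may differ.

```python
import re

def parse_table_modification(llm_output: str) -> dict:
    """Split a structured LLM reply into summary, table, and change log.

    Assumes the model answers with '## Summary', '## Modified Table',
    and '## Change Log' section headers (a hypothetical contract, not
    necessarily the one MultiChatService uses).
    """
    sections = {"summary": "", "table": "", "change_log": ""}
    pattern = r"##\s*(Summary|Modified Table|Change Log)\s*\n(.*?)(?=\n##\s|\Z)"
    for name, body in re.findall(pattern, llm_output, flags=re.S):
        # Map "Modified Table" -> "table", "Change Log" -> "change_log", etc.
        key = name.lower().replace(" ", "_").replace("modified_table", "table")
        sections[key] = body.strip()
    return sections
```

Keeping the parse tolerant (regex over section headers rather than strict positional splitting) helps when the model adds stray whitespace or reorders sections.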
- `ContentExtractorService` (conceptual): Responsible for the initial processing of uploaded documents:
  - Parses document content into structured data models: `TextChunk`, `Table`, and `Image`.
  - Stores the extracted components in MongoDB, where they serve as the analytical basis for the chat service.
- Database Models: Uses Beanie to define models for `User`, `Document`, `TextChunk`, `Table`, `Image`, `ChatSession`, and `ChatMessage`, persisted in MongoDB.
- Authentication: Secure user authentication handled via standard API practices.
- File Storage Integration: Interacts with Cloudinary for robust storage and delivery of all document assets.
- Download Service: An `EnhancedTableDownloadService` (or similar logic) in the API layer exports modified tables to formats such as Excel or Markdown.
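The export path can be sketched with the standard library: parse the modified table's Markdown into rows, then serialize. The real `EnhancedTableDownloadService` presumably produces `.xlsx` via a spreadsheet library such as openpyxl (an assumption); CSV is used here so the example stays dependency-free.

```python
import csv
import io

def markdown_table_to_rows(md: str) -> list[list[str]]:
    """Parse a simple GFM pipe table into rows, skipping the |---| divider."""
    rows = []
    for line in md.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Divider rows like |---|:--:| contain only dashes, colons, spaces.
        if all(set(c) <= set("-: ") for c in cells):
            continue
        rows.append(cells)
    return rows

def rows_to_csv(rows: list[list[str]]) -> str:
    """Serialize parsed rows; an Excel exporter would swap in a workbook writer."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()
```

The same row representation can feed either export format, which is why parsing and serialization are kept as separate steps.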
### Document Processing Pipeline

- Upload: Users upload documents (e.g., PDFs).
- Extraction: The `ContentExtractorService` processes the document, extracting:
  - Text Chunks: paragraphs, sentences, or other raw text units.
  - Tables: structured data identified within the document.
  - Images: visual elements detected in the document.
- Storage: Extracted data is stored in MongoDB, linked to the original `Document` record.
- Analysis: This structured data (`TextChunk`, `Table`, `Image`) becomes the input for the `MultiChatService` and other analytical functions. Chat sessions can be initiated for any existing document, regardless of which content types were extracted.
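The pipeline above can be sketched end-to-end with simplified, dependency-free stand-ins for the Beanie models. The toy extractor below treats pipe-delimited lines as table rows; a real `ContentExtractorService` would use PDF parsing and layout analysis instead.

```python
from dataclasses import dataclass, field

# Simplified stand-ins for the TextChunk / Table / Image models;
# the real versions are Beanie documents persisted in MongoDB.
@dataclass
class TextChunk:
    document_id: str
    text: str

@dataclass
class Table:
    document_id: str
    rows: list

@dataclass
class Image:
    document_id: str
    url: str

@dataclass
class ExtractionResult:
    chunks: list = field(default_factory=list)
    tables: list = field(default_factory=list)
    images: list = field(default_factory=list)

def process_document(document_id: str, pages: list[str]) -> ExtractionResult:
    """Toy extractor: non-empty lines become text chunks, runs of
    pipe-delimited lines become tables."""
    result = ExtractionResult()
    current_rows: list[list[str]] = []
    for page in pages:
        for line in page.splitlines():
            line = line.strip()
            if not line:
                continue
            if "|" in line:
                current_rows.append([c.strip() for c in line.strip("|").split("|")])
            else:
                if current_rows:  # a table run just ended; flush it
                    result.tables.append(Table(document_id, current_rows))
                    current_rows = []
                result.chunks.append(TextChunk(document_id, line))
    if current_rows:
        result.tables.append(Table(document_id, current_rows))
    return result
```

Everything downstream (storage, chat, export) then works against these structured records rather than raw document bytes.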
## Getting Started

To get a local copy up and running, follow these steps.

### Prerequisites
- Node.js (LTS version recommended)
- Python 3.9+
- MongoDB instance (local or cloud-hosted)
- Cloudinary Account (with API Key, API Secret, Cloud Name)
- LLM Provider Account (e.g., OpenAI, Groq)
### Installation

- Clone the repository:

  ```bash
  git clone [your-repository-url]
  cd [your-repository-name]
  ```

- Backend Setup:

  ```bash
  cd backend  # or the directory containing your FastAPI app
  pip install -r requirements.txt
  ```

  Create a `.env` file in your backend root directory and configure the environment variables (replace the placeholders with your actual credentials):

  ```env
  MONGO_URI="mongodb://localhost:27017/your_db_name"
  CLOUD_NAME="your_cloudinary_cloud_name"
  API_KEY="your_cloudinary_api_key"
  API_SECRET="your_cloudinary_api_secret"
  LLM_API_KEY="your_llm_api_key"
  # Add any other environment variables required by your specific LLM or services
  ```

- Frontend Setup:

  ```bash
  cd frontend  # or the directory containing your Next.js app
  npm install  # or yarn install
  ```

  Create a `.env.local` file in your frontend root directory:

  ```env
  NEXT_PUBLIC_BACKEND_URL="http://localhost:8000"  # Or your backend's URL
  ```
### Running the Application

- Start the Backend:

  ```bash
  cd backend
  uvicorn main:app --reload
  ```

  (Adjust `main:app` if your FastAPI app is named differently.)

- Start the Frontend:

  ```bash
  cd frontend
  npm run dev  # or yarn dev
  ```
The application should now be accessible in your browser, typically at `http://localhost:3000`.
## Usage

- Upload Documents: Navigate to the document management section to upload your PDF files.
- Start a Chat: Select a document and choose to start either a "General Chat" or an "Analytical Chat" based on your needs.
- Interact with AI: Type your questions or commands in natural language.
- Modify Tables: Ask the AI to make specific changes to tables, like "change the CGPA for 2027 to 10".
- Download Results: If a table is modified, a download button will appear, allowing you to export the modified table as an Excel file.
## Contributing

We welcome contributions! If you'd like to improve this project, please follow these steps:
- Fork the repository.
- Create a new branch (`git checkout -b feature/your-feature-name`).
- Make your changes.
- Commit your changes (`git commit -m 'Add new feature'`).
- Push to the branch (`git push origin feature/your-feature-name`).
- Open a Pull Request.
## License

This project is licensed under the [Your Chosen License, e.g., MIT License]. See the LICENSE file for details.