This repository contains two AI applications developed by the ModelWorks team:
main - RAG-Powered Document Q&A System
A full-stack application that enables users to upload PDF documents and ask questions about their content using Retrieval-Augmented Generation (RAG) with local LLM models.
clinical-trials-chat-api - Clinical Trial Analytics Dashboard
A Next.js application for clinical trial data visualization and AI-powered analysis of patient data, biomarkers, and trial outcomes.
- PDF Document Upload: Drag-and-drop interface for uploading PDF files
- Intelligent Q&A: Ask questions about uploaded documents using RAG
- Local LLM Integration: Uses Ollama with DeepSeek R1 (1.5B) model
- Vector Database: ChromaDB for document storage and retrieval
- Modern UI: Responsive web interface with dark/light theme toggle
- Real-time Chat: Interactive chat interface for document queries
- Dockerized: Complete containerized setup for easy deployment
The system consists of three main components:
- Frontend: HTML/CSS/JavaScript web interface
- Backend: FastAPI server with RAG pipeline
- Ollama: Local LLM service for text generation and embeddings
- Backend: FastAPI, Python, LangChain
- Frontend: HTML5, CSS3, JavaScript (Vanilla)
- LLM: Ollama with DeepSeek R1 (1.5B)
- Embeddings: Nomic Embed Text
- Vector DB: ChromaDB
- PDF Processing: pdfplumber
- Containerization: Docker & Docker Compose
- Docker and Docker Compose
- Git
1. Clone the repository:
   git clone <repository-url>
   cd projects-modelworks
2. Make the init script executable:
   chmod +x init.sh
3. Run the initialization script:
   ./init.sh
This script will:
- Start all Docker containers
- Download the required LLM models (DeepSeek R1 1.5B and Nomic Embed Text)
- Set up the vector database
4. Access the application:
- Frontend: http://localhost:3000
- Backend API: http://localhost:7860
- Ollama: http://localhost:11434
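To confirm all three services came up after running init.sh, a quick reachability check can be scripted. The URLs match those listed above; the injectable fetch parameter is only a testing convenience, not part of the project.

```python
import urllib.request

SERVICES = {
    "frontend": "http://localhost:3000",
    "backend": "http://localhost:7860",
    "ollama": "http://localhost:11434",
}

def check_services(fetch=None):
    """Return a dict mapping each service name to True/False reachability."""
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url, timeout=5) as resp:
                return resp.status < 500
    results = {}
    for name, url in SERVICES.items():
        try:
            results[name] = fetch(url)
        except OSError:
            # Connection refused / timeout means the container is not up yet.
            results[name] = False
    return results
```

Run it after `./init.sh` finishes; any False entry points at the container to inspect first.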
- Upload a PDF: Drag and drop a PDF file into the upload area
- Ask Questions: Type your questions about the document content
- Get Answers: The system will use RAG to provide contextually relevant answers
- POST /upload: Upload PDF files for processing
- POST /ask: Send questions about uploaded documents
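The two endpoints can be exercised with a small stdlib-only client. The payload shapes below (raw PDF bytes for /upload, a JSON body with a "question" key for /ask) are illustrative assumptions, not the confirmed API schema; the injectable post callable exists so the client can be tested without a running server.

```python
import json

class DocQAClient:
    """Minimal client for the backend's POST /upload and POST /ask endpoints.
    Request shapes are assumptions for illustration, not the verified schema."""

    def __init__(self, base_url="http://localhost:7860", post=None):
        self.base_url = base_url
        self.post = post or self._default_post

    def _default_post(self, url, data, headers):
        import urllib.request
        req = urllib.request.Request(url, data=data, headers=headers, method="POST")
        with urllib.request.urlopen(req, timeout=60) as resp:
            return resp.read().decode()

    def upload_pdf(self, filename, pdf_bytes):
        # The real endpoint may expect multipart/form-data; raw bytes kept for brevity.
        return self.post(f"{self.base_url}/upload", pdf_bytes,
                         {"Content-Type": "application/pdf", "X-Filename": filename})

    def ask(self, question):
        body = json.dumps({"question": question}).encode()
        return self.post(f"{self.base_url}/ask", body,
                         {"Content-Type": "application/json"})
```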
├── api/                     # API configuration
├── backend/                 # FastAPI backend
│   ├── components/          # Core functionality modules
│   │   ├── database.py      # ChromaDB integration
│   │   ├── embedding.py     # Embedding model setup
│   │   ├── history.py       # Conversation history
│   │   ├── llm.py           # LLM integration
│   │   ├── pipeline.py      # RAG pipeline orchestration
│   │   ├── retrieve.py      # Document retrieval
│   │   ├── store_text.py    # PDF text extraction
│   │   └── upload.py        # File upload handling
│   ├── api_handler.py       # FastAPI application
│   └── requirements.txt     # Python dependencies
├── frontend/                # Web interface
│   ├── index.html           # Main page
│   └── script.js            # Frontend logic
├── docker-compose.yml       # Container orchestration
└── init.sh                  # Setup script
The backend uses a modular architecture with separate components for:
- Pipeline: Orchestrates the RAG workflow
- Database: Manages ChromaDB vector storage
- LLM: Handles text generation with Ollama
- Embedding: Manages text embeddings
- Retrieve: Implements similarity search
- Upload: Processes PDF files
- History: Manages conversation context
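The Retrieve component's similarity search can be sketched with a toy bag-of-words embedding. The real pipeline uses nomic-embed-text vectors stored in ChromaDB, so this is only a shape-level illustration of rank-by-cosine-similarity, not the project's actual code.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; the real system uses nomic-embed-text."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    """Mirror of the Retrieve step: return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

The retrieved chunks would then be passed to the LLM component as context, which is the core of the RAG workflow the Pipeline module orchestrates.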
The frontend is built with vanilla JavaScript and provides:
- Drag-and-drop PDF upload
- Real-time chat interface
- Theme switching (dark/light mode)
- Responsive design
Models are stored in a Docker volume (ollama-data) and persist across container restarts. To avoid losing models:
- ✅ Use docker-compose down (keeps volumes)
- ❌ Avoid docker-compose down -v (deletes volumes)
- ❌ Avoid docker volume rm ollama-data
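The persistence comes from a named volume mounted at Ollama's model directory. An illustrative fragment of how this is typically declared (the project's actual docker-compose.yml may differ in detail):

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama-data:/root/.ollama   # downloaded models live here, outside the container

volumes:
  ollama-data:                      # named volume, survives `docker-compose down`
```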
- Models not downloading: Ensure Ollama service is running and accessible
- Upload failures: Check that the backend service is running on port 7860
- No responses: Verify that the LLM model is properly loaded in Ollama
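To verify the models are actually loaded, Ollama's GET /api/tags endpoint lists installed models (curl http://localhost:11434/api/tags). The helper below parses that JSON offline; the exact model tags, e.g. deepseek-r1:1.5b, are assumptions based on the model names above.

```python
import json

def models_loaded(tags_json, required=("deepseek-r1:1.5b", "nomic-embed-text")):
    """Given the JSON body of Ollama's /api/tags response, report whether each
    required model appears (tag names are assumed, matched on the name prefix)."""
    names = {m.get("name", "") for m in json.loads(tags_json).get("models", [])}
    return {req: any(name.startswith(req.split(":")[0]) for name in names)
            for req in required}
```

If either model reports False, re-run ./init.sh or pull the model manually inside the Ollama container.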
Check container logs:
docker-compose logs backend
docker-compose logs ollama

🧬 Clinical Trials Branch (clinical-trials-chat-api)
A sophisticated clinical trial analytics dashboard built with Next.js, featuring real-time patient data visualization, biomarker analysis, and AI-powered insights for the SCAI-001 immunotherapy study.
- Patient Matching Algorithm: Real-time patient enrollment and screening status
- Biomarker Analysis: EGFR mutations, PD-L1 expression, and ALK fusion analysis
- Tumor Response Metrics: Complete/Partial/Stable/Progressive disease tracking
- Patient Outcomes: Quality of Life (QoL) score monitoring and trends
- Safety & Compliance: Adverse events tracking and protocol deviation monitoring
- AI Research Assistant: GPT-4 powered chat interface for trial data analysis
- Data Export: PDF report generation and JSON data export
- Frontend: Next.js 15, React 18, TypeScript
- Styling: Tailwind CSS 4, Lucide React icons
- Charts: Recharts for data visualization
- AI Integration: OpenAI GPT-4 API
- Data: JSON-based patient dataset with 100+ mock patients
app/
├── api/
│   ├── chat/route.ts               # OpenAI API integration
│   └── data/                       # Patient datasets
├── components/
│   ├── ClinicalTrialDemoMain.jsx   # Main dashboard component
│   ├── Biomarkers.jsx              # Biomarker analysis
│   ├── PatientMatching.jsx         # Patient enrollment
│   ├── TumorMetrics.jsx            # Tumor response tracking
│   ├── OutcomesSafety.jsx          # Safety monitoring
│   └── ChatDocsInterface.jsx       # AI chat interface
├── globals.css                     # Global styles
├── layout.js                       # App layout
└── page.jsx                        # Main page
- Real-time enrollment status tracking
- Geographic distribution across Australian sites
- Screening pipeline management
- Eligibility criteria monitoring
- EGFR mutation status visualization (67% positive)
- PD-L1 expression levels (47.8% average)
- ALK fusion analysis
- Interactive progress bars and charts
- RECIST criteria compliance
- Complete Response (CR), Partial Response (PR), Stable Disease (SD), Progressive Disease (PD)
- Tumor size change visualization
- Real-time metrics dashboard
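The CR/PR/SD/PD breakdown on the dashboard reduces to a tally over the patient records. The field name tumor_response and the sample records below are hypothetical stand-ins for the real dataset's keys.

```python
from collections import Counter

# Hypothetical records; the actual dataset's field names may differ.
patients = [
    {"id": 1, "tumor_response": "PR"},
    {"id": 2, "tumor_response": "CR"},
    {"id": 3, "tumor_response": "SD"},
    {"id": 4, "tumor_response": "PR"},
    {"id": 5, "tumor_response": "PD"},
]

def response_distribution(records):
    """Percentage of patients in each RECIST category (CR/PR/SD/PD)."""
    counts = Counter(r["tumor_response"] for r in records)
    total = len(records)
    return {cat: round(100 * counts.get(cat, 0) / total, 1)
            for cat in ("CR", "PR", "SD", "PD")}
```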
- Baseline vs current Quality of Life scores
- Improvement tracking (+8.1 average improvement)
- Individual patient progress monitoring
- Trend analysis
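A summary figure like the +8.1 average improvement above is presumably the mean of per-patient (current minus baseline) deltas. The field names qol_baseline and qol_current are illustrative assumptions, not the dataset's confirmed keys.

```python
def mean_qol_improvement(records):
    """Average Quality-of-Life gain: current score minus baseline, rounded to 1 dp.
    Field names are illustrative; the real dataset's keys may differ."""
    deltas = [r["qol_current"] - r["qol_baseline"] for r in records]
    return round(sum(deltas) / len(deltas), 1)
```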
- Severe adverse events tracking (Grade 3+)
- Protocol deviation monitoring
- Secondary infection tracking (COVID-19, Pneumonia, Sepsis)
- Compliance rate calculation (98.7%)
- GPT-4 powered chat interface
- Clinical trial data analysis
- Patient pattern recognition
- Research insights generation
1. Switch to the clinical trials branch:
   git checkout clinical-trials-chat-api
2. Install dependencies:
   npm install
3. Set up environment variables: create .env.local containing:
   OPENAI_API_KEY=your_openai_api_key_here
4. Run the development server:
   npm run dev
5. Access the application: open http://localhost:3000
The application uses a comprehensive patient dataset with the following fields:
- Demographics: Age, sex, location
- Clinical: Cancer type, stage, enrollment status
- Biomarkers: EGFR mutation, PD-L1 expression, ALK fusion
- Outcomes: Tumor response, QoL scores, adverse events
- Safety: Protocol deviations, infections, compliance
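A single record covering these field groups might look as follows. Every field name and value here is illustrative, sketched from the categories above, and is not copied from the actual dataset.

```json
{
  "id": "SCAI-001-042",
  "age": 63,
  "sex": "F",
  "location": "Sydney",
  "stage": "IIIB",
  "enrollment_status": "Enrolled",
  "egfr_mutation": true,
  "pdl1_expression": 45.0,
  "alk_fusion": false,
  "tumor_response": "PR",
  "qol_baseline": 58,
  "qol_current": 67,
  "adverse_events": ["Grade 2 fatigue"],
  "protocol_deviation": false
}
```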
- OpenAI GPT-4: For AI-powered clinical insights
- Data Export: PDF reports and JSON data downloads
- Real-time Updates: Live dashboard metrics
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
- Tom – Bachelor of Advanced Computing – Scribe
- Xuan – Bachelor of Software Engineering – Project Manager
- Jana – Bachelor of Software Engineering – Monitor
- Arnav – Bachelor of Software Engineering – Spokesperson
- Scarlett – Master of Computing – Checker
- Josh – Bachelor of Computing – Deputy
- Jaylee – Master of Computing – Coordinator
ModelWorks © 2025 – Bridging AI research with practical applications