Skip to content

joeyllm/projects-modelworks

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

55 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

ModelWorks

This repository contains two AI applications developed by the ModelWorks team:

🌿 Branch Overview

main - RAG-Powered Document Q&A System

A full-stack application that enables users to upload PDF documents and ask questions about their content using Retrieval-Augmented Generation (RAG) with local LLM models.

clinical-trials-chat-api - Clinical Trial Analytics Dashboard

A Next.js application for clinical trial data visualization and AI-powered analysis of patient data, biomarkers, and trial outcomes.

πŸš€ Features

  • PDF Document Upload: Drag-and-drop interface for uploading PDF files
  • Intelligent Q&A: Ask questions about uploaded documents using RAG
  • Local LLM Integration: Uses Ollama with DeepSeek R1 (1.5B) model
  • Vector Database: ChromaDB for document storage and retrieval
  • Modern UI: Responsive web interface with dark/light theme toggle
  • Real-time Chat: Interactive chat interface for document queries
  • Dockerized: Complete containerized setup for easy deployment

πŸ—οΈ Architecture

The system consists of three main components:

  • Frontend: HTML/CSS/JavaScript web interface
  • Backend: FastAPI server with RAG pipeline
  • Ollama: Local LLM service for text generation and embeddings

Tech Stack

  • Backend: FastAPI, Python, LangChain
  • Frontend: HTML5, CSS3, JavaScript (Vanilla)
  • LLM: Ollama with DeepSeek R1 (1.5B)
  • Embeddings: Nomic Embed Text
  • Vector DB: ChromaDB
  • PDF Processing: pdfplumber
  • Containerization: Docker & Docker Compose

πŸ“‹ Prerequisites

  • Docker and Docker Compose
  • Git

πŸš€ Quick Start

  1. Clone the repository:

    git clone <repository-url>
    cd projects-modelworks
  2. Make the init script executable:

    chmod +x init.sh
  3. Run the initialization script:

    ./init.sh

This script will:

  • Start all Docker containers
  • Download the required LLM models (DeepSeek R1 1.5B and Nomic Embed Text)
  • Set up the vector database
  1. Access the application:

πŸ“– Usage

  1. Upload a PDF: Drag and drop a PDF file into the upload area
  2. Ask Questions: Type your questions about the document content
  3. Get Answers: The system will use RAG to provide contextually relevant answers

πŸ”§ API Endpoints

  • POST /upload: Upload PDF files for processing
  • POST /ask: Send questions about uploaded documents

πŸ“ Project Structure

β”œβ”€β”€ api/                    # API configuration
β”œβ”€β”€ backend/                # FastAPI backend
β”‚   β”œβ”€β”€ components/         # Core functionality modules
β”‚   β”‚   β”œβ”€β”€ database.py    # ChromaDB integration
β”‚   β”‚   β”œβ”€β”€ embedding.py   # Embedding model setup
β”‚   β”‚   β”œβ”€β”€ history.py     # Conversation history
β”‚   β”‚   β”œβ”€β”€ llm.py         # LLM integration
β”‚   β”‚   β”œβ”€β”€ pipeline.py    # RAG pipeline orchestration
β”‚   β”‚   β”œβ”€β”€ retrieve.py    # Document retrieval
β”‚   β”‚   β”œβ”€β”€ store_text.py  # PDF text extraction
β”‚   β”‚   └── upload.py      # File upload handling
β”‚   β”œβ”€β”€ api_handler.py     # FastAPI application
β”‚   └── requirements.txt   # Python dependencies
β”œβ”€β”€ frontend/              # Web interface
β”‚   β”œβ”€β”€ index.html        # Main page
β”‚   └── script.js         # Frontend logic
β”œβ”€β”€ docker-compose.yml     # Container orchestration
└── init.sh               # Setup script

πŸ› οΈ Development

Backend Development

The backend uses a modular architecture with separate components for:

  • Pipeline: Orchestrates the RAG workflow
  • Database: Manages ChromaDB vector storage
  • LLM: Handles text generation with Ollama
  • Embedding: Manages text embeddings
  • Retrieve: Implements similarity search
  • Upload: Processes PDF files
  • History: Manages conversation context

Frontend Development

The frontend is built with vanilla JavaScript and provides:

  • Drag-and-drop PDF upload
  • Real-time chat interface
  • Theme switching (dark/light mode)
  • Responsive design

πŸ”’ Model Storage

Models are stored in a Docker volume (ollama-data) and persist across container restarts. To avoid losing models:

  • βœ… Use docker-compose down (keeps volumes)
  • ❌ Avoid docker-compose down -v (deletes volumes)
  • ❌ Avoid docker volume rm ollama-data

πŸ› Troubleshooting

Common Issues

  1. Models not downloading: Ensure Ollama service is running and accessible
  2. Upload failures: Check that the backend service is running on port 7860
  3. No responses: Verify that the LLM model is properly loaded in Ollama

Logs

Check container logs:

docker-compose logs backend
docker-compose logs ollama

🧬 Clinical Trials Branch (clinical-trials-chat-api)

Overview

A sophisticated clinical trial analytics dashboard built with Next.js, featuring real-time patient data visualization, biomarker analysis, and AI-powered insights for the SCAI-001 immunotherapy study.

Key Features

  • Patient Matching Algorithm: Real-time patient enrollment and screening status
  • Biomarker Analysis: EGFR mutations, PD-L1 expression, and ALK fusion analysis
  • Tumor Response Metrics: Complete/Partial/Stable/Progressive disease tracking
  • Patient Outcomes: Quality of Life (QoL) score monitoring and trends
  • Safety & Compliance: Adverse events tracking and protocol deviation monitoring
  • AI Research Assistant: GPT-4 powered chat interface for trial data analysis
  • Data Export: PDF report generation and JSON data export

Tech Stack

  • Frontend: Next.js 15, React 18, TypeScript
  • Styling: Tailwind CSS 4, Lucide React icons
  • Charts: Recharts for data visualization
  • AI Integration: OpenAI GPT-4 API
  • Data: JSON-based patient dataset with 100+ mock patients

Project Structure

app/
β”œβ”€β”€ api/
β”‚   β”œβ”€β”€ chat/route.ts          # OpenAI API integration
β”‚   └── data/                  # Patient datasets
β”œβ”€β”€ components/
β”‚   β”œβ”€β”€ ClinicalTrialDemoMain.jsx  # Main dashboard component
β”‚   β”œβ”€β”€ Biomarkers.jsx         # Biomarker analysis
β”‚   β”œβ”€β”€ PatientMatching.jsx    # Patient enrollment
β”‚   β”œβ”€β”€ TumorMetrics.jsx      # Tumor response tracking
β”‚   β”œβ”€β”€ OutcomesSafety.jsx    # Safety monitoring
β”‚   └── ChatDocsInterface.jsx  # AI chat interface
β”œβ”€β”€ globals.css               # Global styles
β”œβ”€β”€ layout.js                 # App layout
└── page.jsx                  # Main page

Key Features

1. Patient Matching Dashboard

  • Real-time enrollment status tracking
  • Geographic distribution across Australian sites
  • Screening pipeline management
  • Eligibility criteria monitoring

2. Biomarker Analysis

  • EGFR mutation status visualization (67% positive)
  • PD-L1 expression levels (47.8% average)
  • ALK fusion analysis
  • Interactive progress bars and charts

3. Tumor Response Tracking

  • RECIST criteria compliance
  • Complete Response (CR), Partial Response (PR), Stable Disease (SD), Progressive Disease (PD)
  • Tumor size change visualization
  • Real-time metrics dashboard

4. Patient Outcomes & QoL

  • Baseline vs current Quality of Life scores
  • Improvement tracking (+8.1 average improvement)
  • Individual patient progress monitoring
  • Trend analysis

5. Safety & Compliance

  • Severe adverse events tracking (Grade 3+)
  • Protocol deviation monitoring
  • Secondary infection tracking (COVID-19, Pneumonia, Sepsis)
  • Compliance rate calculation (98.7%)

6. AI Research Assistant

  • GPT-4 powered chat interface
  • Clinical trial data analysis
  • Patient pattern recognition
  • Research insights generation

Setup Instructions

  1. Switch to the clinical trials branch:

    git checkout clinical-trials-chat-api
  2. Install dependencies:

    npm install
  3. Set up environment variables: Create .env.local:

    OPENAI_API_KEY=your_openai_api_key_here
  4. Run the development server:

    npm run dev
  5. Access the application: Open http://localhost:3000

Data Structure

The application uses a comprehensive patient dataset with the following fields:

  • Demographics: Age, sex, location
  • Clinical: Cancer type, stage, enrollment status
  • Biomarkers: EGFR mutation, PD-L1 expression, ALK fusion
  • Outcomes: Tumor response, QoL scores, adverse events
  • Safety: Protocol deviations, infections, compliance

API Integration

  • OpenAI GPT-4: For AI-powered clinical insights
  • Data Export: PDF reports and JSON data downloads
  • Real-time Updates: Live dashboard metrics

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request

πŸ‘₯ Team

  • Tom – Bachelor of Advanced Computing - Scribe
  • Xuan – Bachelor of Software Engineering – Project Manager
  • Jana – Bachelor of Software Engineering - Monitor
  • Arnav – Bachelor of Software Engineering - Spokesperson
  • Scarlett – Master of Computing - Checker
  • Josh – Bachelor of Computing - Deputy
  • Jaylee – Master of Computing - Coordinator

ModelWorks Β© 2025 - Bridging AI research with practical applications

Releases

No releases published

Packages

No packages published

Contributors 7