MediTalk - Research framework for giving Meditron medical AI natural conversational speech capabilities.

EPFLiGHT/MediTalk


MediTalk - Medical AI with Voice

Python 3.10+ Research

MediTalk is a research project that gives MultiMeditron, the medical LLM from LiGHT Laboratory, natural conversational speech capabilities, enabling voice-based medical interactions.

Prerequisites

Setup

Create .env file:

HUGGINGFACE_TOKEN=your_token
MULTIMEDITRON_HF_TOKEN=your_token
MULTIMEDITRON_MODEL=ClosedMeditron/Mulimeditron-End2End-CLIP-medical
ORPHEUS_MODEL=canopylabs/orpheus-3b-0.1-ft
WHISPER_MODEL=base
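
Before starting the services, it can help to confirm that the .env file is complete. The following is a minimal sketch, not part of the repository: it assumes the five keys listed above are all required and treats `your_token` as an unfilled placeholder. The helper names `parse_env` and `missing_keys` are hypothetical.

```python
from pathlib import Path

# Keys taken from the .env example above (assumed to be all required).
REQUIRED_KEYS = [
    "HUGGINGFACE_TOKEN",
    "MULTIMEDITRON_HF_TOKEN",
    "MULTIMEDITRON_MODEL",
    "ORPHEUS_MODEL",
    "WHISPER_MODEL",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent, empty, or still placeholders."""
    return [k for k in REQUIRED_KEYS if values.get(k, "") in ("", "your_token")]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(parse_env(text))
    print("Missing or placeholder keys:", missing or "none")
```

Run it from the repository root so that `.env` resolves to the file created above.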

Deployment

Setup environments (first run only):

./scripts/setup-local.sh

Start all services:

./scripts/start-local.sh

Access web interface: http://localhost:8503

Stop services:

./scripts/stop-local.sh

Monitor:

./scripts/health-check.sh
tail -f logs/controller.log

RCP Cluster Deployment

To deploy on the EPFL RCP cluster, follow the setup instructions in the LiGHT RCP Documentation. Once the environment is ready, clone this repository and follow the deployment steps above to run MediTalk on the cluster.

Services Architecture

| Service       | Port | Description                         |
|---------------|------|-------------------------------------|
| Controller    | 8000 | Orchestrates LLM, TTS, STT services |
| Web UI        | 8503 | Streamlit interface                 |
| MultiMeditron | 5009 | Medical AI model                    |
| Whisper       | 5007 | Speech-to-text                      |
| Orpheus       | 5005 | Text-to-speech                      |
| Bark          | 5008 | Text-to-speech                      |
| CSM           | 5010 | Conversational Text-to-speech       |
| Qwen3-Omni    | 5014 | Conversational Text-to-speech       |
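
As a quick way to see which of the services above are up, the sketch below tries a plain TCP connection to each port on localhost. It is an illustration only: it assumes the services bind to 127.0.0.1, it does not call any service-specific health endpoint, and it is not a replacement for `./scripts/health-check.sh`. The names `is_port_open` and `port_status` are hypothetical.

```python
import socket

# Ports copied from the service table above.
SERVICE_PORTS = {
    "Controller": 8000,
    "Web UI": 8503,
    "MultiMeditron": 5009,
    "Whisper": 5007,
    "Orpheus": 5005,
    "Bark": 5008,
    "CSM": 5010,
    "Qwen3-Omni": 5014,
}


def is_port_open(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def port_status(ports: dict[str, int]) -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, is_up in port_status(SERVICE_PORTS).items():
        state = "listening" if is_up else "not reachable"
        print(f"{name:<14} {SERVICE_PORTS[name]:<5} {state}")
```

An open port only shows that the process is running; a model may still be loading behind it.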

Troubleshooting

Service won't start:

tail -f logs/<service>.log

Check for errors, missing dependencies or missing tokens.

Note: Some services may take several minutes to load models on first run.
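
When several services fail at once, scanning every log for likely error lines can be faster than tailing them one by one. This is a rough sketch under assumptions: the log format is not documented here, so the pattern below is a guess at common markers, and `find_error_lines` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumed markers; adjust to whatever the services actually log.
ERROR_PATTERN = re.compile(r"error|traceback|exception", re.IGNORECASE)


def find_error_lines(text: str, limit: int = 20) -> list[str]:
    """Return up to `limit` of the most recent lines matching ERROR_PATTERN."""
    hits = [line for line in text.splitlines() if ERROR_PATTERN.search(line)]
    return hits[-limit:] if limit > 0 else []


if __name__ == "__main__":
    logs_dir = Path("logs")
    log_files = sorted(logs_dir.glob("*.log")) if logs_dir.is_dir() else []
    for log_file in log_files:
        hits = find_error_lines(log_file.read_text(errors="replace"))
        print(f"{log_file.name}: {len(hits)} suspicious line(s)")
        for line in hits:
            print("   ", line)
```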

Missing ffmpeg:

sudo apt-get update && sudo apt-get install -y ffmpeg
./scripts/restart.sh

Model loading fails:

  • Verify HuggingFace token in .env
  • Check disk space (models are large)
  • Review service logs in logs/ directory
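
The disk space check can be scripted. The sketch below reports free space on the filesystem that holds the Hugging Face cache, which defaults to `~/.cache/huggingface` unless `HF_HOME` is set. Whether every MediTalk service stores its weights there is an assumption, and the 20 GiB threshold is a hypothetical value, not a documented requirement.

```python
import os
import shutil
from pathlib import Path

# Hypothetical threshold; the README only says that models are large.
MIN_FREE_GIB = 20.0


def nearest_existing(path: Path) -> Path:
    """Walk up from `path` until a path that exists is found."""
    path = path.expanduser().resolve()
    while not path.exists():
        path = path.parent
    return path


def free_gib(path: Path) -> float:
    """Free space in GiB on the filesystem that holds `path`."""
    return shutil.disk_usage(nearest_existing(path)).free / 1024**3


if __name__ == "__main__":
    cache_dir = Path(os.environ.get("HF_HOME", "~/.cache/huggingface"))
    free = free_gib(cache_dir)
    verdict = "ok" if free >= MIN_FREE_GIB else "low"
    print(f"{cache_dir}: {free:.1f} GiB free ({verdict})")
```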

Project Structure

MediTalk/
├── .env                      # Environment configuration
├── scripts/                  # Service management scripts
│   ├── setup-local.sh        # Install dependencies
│   ├── start-local.sh        # Start all services
│   ├── stop-local.sh         # Stop all services
│   ├── restart.sh            # Restart services
│   ├── health-check.sh       # Check service health
│   └── monitor-gpus.sh       # GPU monitoring
├── services/
│   ├── controller/           # Orchestration service
│   ├── modelMultiMeditron/   # Medical LLM
│   ├── modelWhisper/         # Speech-to-text
│   ├── modelOrpheus/         # TTS
│   ├── modelBark/            # TTS
│   ├── modelCSM/             # TTS (conversational)
│   ├── modelQwen3Omni/       # TTS (conversational)
│   └── webui/                # Streamlit interface
├── inputs/                   # Input files (conversation JSON files)
├── outputs/                  # Generated audio files
└── logs/                     # Service logs

Acknowledgments


Semester Project | Nicolas Teissier | LiGHT Laboratory | EPFL
