MediTalk is a research project that adds natural conversational speech capabilities to MultiMeditron, the medical LLM from the EPFL LiGHT Laboratory, enabling voice-based medical interactions.

Prerequisites:
- Python 3.10+
- 48GB+ RAM (see the quick check after this list)
- HuggingFace access token (create one in your HuggingFace account settings)
- Access to meta-llama/Meta-Llama-3.1-8B-Instruct
- Access to ClosedMeditron/Mulimeditron-End2End-CLIP-medical (request from EPFL LiGHT lab)
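
A quick way to confirm the Python and memory requirements (a minimal sketch, assuming a Linux host; `free` is not available on macOS):

```bash
# Quick environment check (Linux): interpreter version and total RAM.
python3 --version                      # expect Python 3.10 or newer
free -g | awk '/Mem:/ {print $2 " GB of RAM"}'
```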
Create a .env file in the project root:

```
HUGGINGFACE_TOKEN=your_token
MULTIMEDITRON_HF_TOKEN=your_token
MULTIMEDITRON_MODEL=ClosedMeditron/Mulimeditron-End2End-CLIP-medical
ORPHEUS_MODEL=canopylabs/orpheus-3b-0.1-ft
WHISPER_MODEL=base
```
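
Optionally, a quick sanity check that every key from the example above made it into `.env`:

```bash
# Should print 5 when all five keys from the example above are present.
grep -cE '^(HUGGINGFACE_TOKEN|MULTIMEDITRON_HF_TOKEN|MULTIMEDITRON_MODEL|ORPHEUS_MODEL|WHISPER_MODEL)=' .env
```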
Set up the environments (first run only):

```
./scripts/setup-local.sh
```

Start all services:

```
./scripts/start-local.sh
```

Access the web interface at http://localhost:8503
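
Once startup finishes, you can confirm that the two user-facing services respond. The ports come from the services table below; no specific API routes are assumed, so this only checks that each server answers HTTP requests:

```bash
# Print the HTTP status code returned by the web UI and the controller.
curl -s -o /dev/null -w "Web UI (8503)     -> HTTP %{http_code}\n" http://localhost:8503
curl -s -o /dev/null -w "Controller (8000) -> HTTP %{http_code}\n" http://localhost:8000
```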
Stop services:

```
./scripts/stop-local.sh
```

Monitor:

```
./scripts/health-check.sh
tail -f logs/controller.log
```

For deployment on the EPFL RCP cluster, please refer to the LiGHT RCP Documentation for setup instructions. After setting up the environment, you can clone this repository and follow the deployment steps above to run MediTalk on the cluster.
| Service | Port | Description |
|---|---|---|
| Controller | 8000 | Orchestrates the LLM, TTS, and STT services |
| Web UI | 8503 | Streamlit interface |
| MultiMeditron | 5009 | Medical AI model |
| Whisper | 5007 | Speech-to-text |
| Orpheus | 5005 | Text-to-speech |
| Bark | 5008 | Text-to-speech |
| CSM | 5010 | Conversational text-to-speech |
| Qwen3-Omni | 5014 | Conversational text-to-speech |
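
If something seems down, you can probe each port from the table above to see which services are actually listening. This is a minimal sketch that relies on bash's built-in `/dev/tcp`, so no extra tools are required:

```bash
# "closed" usually means the service is still loading its model or
# failed to start; check the corresponding log in logs/.
for port in 8000 8503 5009 5007 5005 5008 5010 5014; do
  if (echo > "/dev/tcp/localhost/$port") 2>/dev/null; then
    echo "port $port: open"
  else
    echo "port $port: closed"
  fi
done
```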
Service won't start:

```
tail -f logs/<service>.log
```

Check for errors, missing dependencies or missing tokens.
Note: Some services may take several minutes to load models on first run.
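
To scan every service log at once instead of tailing them one by one, something like this works with the `logs/<service>.log` layout used above:

```bash
# Show the most recent error-like lines across all service logs.
grep -nEi "error|exception|traceback" logs/*.log | tail -n 40
```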
Missing ffmpeg:

```
sudo apt-get update && sudo apt-get install -y ffmpeg
./scripts/restart.sh
```

Model loading fails:
- Verify the HuggingFace token in `.env` (see the check below)
- Check disk space (models are large)
- Review service logs in the `logs/` directory
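
A minimal sketch for the first two checks, assuming the `huggingface_hub` Python package is installed in the active environment and that `.env` contains plain KEY=VALUE lines:

```bash
# Load .env into the shell, then verify the token and gated-model access.
set -a; source .env; set +a
python3 - <<'EOF'
import os
from huggingface_hub import whoami, model_info

token = os.environ["HUGGINGFACE_TOKEN"]
print("Token belongs to:", whoami(token=token)["name"])
# model_info raises an error if this account was not granted access.
print("Model access OK:", model_info(os.environ["MULTIMEDITRON_MODEL"], token=token).id)
EOF
df -h .   # models are large; confirm there is enough free disk space
```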
Project structure:

```
MediTalk/
├── .env                    # Environment configuration
├── scripts/                # Service management scripts
│   ├── setup-local.sh      # Install dependencies
│   ├── start-local.sh      # Start all services
│   ├── stop-local.sh       # Stop all services
│   ├── restart.sh          # Restart services
│   ├── health-check.sh     # Check service health
│   └── monitor-gpus.sh     # GPU monitoring
├── services/
│   ├── controller/         # Orchestration service
│   ├── modelMultiMeditron/ # Medical LLM
│   ├── modelWhisper/       # Speech-to-text
│   ├── modelOrpheus/       # TTS
│   ├── modelBark/          # TTS
│   ├── modelCSM/           # TTS (conversational)
│   ├── modelQwen3Omni/     # TTS (conversational)
│   └── webui/              # Streamlit interface
├── inputs/                 # Input files (conversation JSON files)
├── outputs/                # Generated audio files
└── logs/                   # Service logs
```
Acknowledgments:

- MultiMeditron - EPFL LiGHT Lab
- Orpheus - Canopy Labs
- Bark - Suno AI
- Whisper - OpenAI
- Qwen3-Omni - Alibaba Cloud
- FastAPI & Streamlit
Semester Project | Nicolas Teissier | LiGHT Laboratory | EPFL