This project fine-tunes the Qwen2.5-0.5B model on the Counsel Chat dataset to create a mental health counseling assistant.
For macOS (Apple Silicon):

```bash
uv sync
```

For Linux/Windows with CUDA:

```bash
uv sync --extra cuda
```

Note: The 0.5B model is small enough to run efficiently on macOS without quantization.
Run the scripts in order:

```bash
uv run python test_setup.py
uv run python prepare_counsel_dataset.py --max_samples 100
uv run python train_qwen_counsel.py
uv run python inference.py --interactive
```

Project files:

- `config.json` - Training configuration (using Qwen2.5-0.5B for testing)
- `prepare_counsel_dataset.py` - Dataset preparation script
- `train_qwen_counsel.py` - Main training script with LoRA
- `inference.py` - Inference script for testing the trained model
- `test_setup.py` - Setup verification script
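The dataset preparation step can be sketched as follows: each Counsel Chat question/answer pair is converted into a chat-style training example in the messages format Qwen2.5-Instruct expects. This is a minimal illustration, not the actual `prepare_counsel_dataset.py` code; the field names (`questionText`, `answerText`) and the system prompt are assumptions.

```python
# Hypothetical sketch of the formatting done in prepare_counsel_dataset.py.
# Assumed: Counsel Chat rows expose "questionText" and "answerText" fields.

SYSTEM_PROMPT = "You are a supportive mental health counseling assistant."

def to_chat_example(row: dict) -> dict:
    """Convert one Counsel Chat row into a chat-format training example."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": row["questionText"].strip()},
            {"role": "assistant", "content": row["answerText"].strip()},
        ]
    }

example = to_chat_example(
    {"questionText": "How do I manage exam anxiety?",
     "answerText": "Start by noticing when the anxiety appears..."}
)
print(example["messages"][1]["content"])
```

A tokenizer's chat template (e.g. `tokenizer.apply_chat_template`) can then render each `messages` list into a single training string.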
The training uses:
- Model: Qwen2.5-0.5B-Instruct (smallest model for testing)
- Method: LoRA fine-tuning with 4-bit quantization
- Dataset: Counsel Chat (mental health Q&A)
- Batch Size: 8 (optimized for 0.5B model)
- Max Length: 1024 tokens
- Epochs: 2 (for quick testing)
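A hedged sketch of what `config.json` might look like, mirroring the settings above. The key names and the LoRA hyperparameters (`r`, `lora_alpha`, `lora_dropout`, `target_modules`) are assumptions for illustration, not taken from the actual file:

```json
{
  "model_name": "Qwen/Qwen2.5-0.5B-Instruct",
  "max_length": 1024,
  "num_train_epochs": 2,
  "per_device_train_batch_size": 8,
  "load_in_4bit": true,
  "lora": {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"]
  }
}
```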
From the Qwen2.5 collection:
- 0.5B (current) - Fastest, least memory
- 1.5B, 3B, 7B, 14B, 32B, 72B - Larger models for better quality
To use a larger model, update `model_name` in `config.json`.
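For example, switching to the 1.5B model would look like this, assuming Hugging Face repo-id naming and that the config key is `model_name` as stated above:

```json
{
  "model_name": "Qwen/Qwen2.5-1.5B-Instruct"
}
```

Larger models will need more memory and may require the 4-bit quantization path on CUDA.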