LLM Training Agent for SFT/GRPO/DPO workflows.
- agent/ - Core orchestration logic
- tools/ - Individual tool implementations
- workflows/ - Workflow definitions
- ui/ - Streamlit interface
- config/ - Configuration files
- scripts/ - Executable scripts

# Create CPU virtual environment
python -m venv venv-cpu
source venv-cpu/bin/activate # On Windows: venv-cpu\Scripts\activate
# Install CPU-only dependencies
pip install -r requirements-cpu.txt
# Run pipeline
python -m agent.main \
--data_path "your-dataset" \
--out_dir "./output/run1" \
--max_iters 3
# Or use UI
streamlit run ui/app.py

# Create GPU virtual environment
python -m venv venv-gpu
source venv-gpu/bin/activate # On Windows: venv-gpu\Scripts\activate
# Install GPU dependencies
pip install -r requirements.txt
# Run pipeline
python -m agent.main \
--data_path "your-dataset" \
--out_dir "./output/run1" \
--max_iters 3

- SFT (Supervised Fine-Tuning)
- GRPO (Group Relative Policy Optimization)
- DPO (Direct Preference Optimization)
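
The pipeline commands above pass three flags to `agent.main`: `--data_path`, `--out_dir`, and `--max_iters`. As a rough illustration of how an entry point could accept them, here is a minimal sketch using the standard-library `argparse` module. The function name `build_parser`, the help strings, and the default values are assumptions made for this example; the project's actual `agent/main.py` may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(prog="agent.main")
    # Dataset identifier or path; treated as required here (an assumption).
    parser.add_argument("--data_path", required=True, help="Dataset name or path")
    # Output directory; the default is assumed from the example command.
    parser.add_argument("--out_dir", default="./output/run1", help="Directory for run outputs")
    # Iteration cap; the default of 3 is assumed from the example command.
    parser.add_argument("--max_iters", type=int, default=3, help="Maximum number of iterations")
    return parser


# Parse the same arguments as the example command, given as an explicit list.
args = build_parser().parse_args(
    ["--data_path", "your-dataset", "--out_dir", "./output/run1", "--max_iters", "3"]
)
print(args.data_path, args.out_dir, args.max_iters)
```

Passing an explicit list to `parse_args` keeps the sketch independent of the real command line; a real entry point would normally call `parse_args()` with no arguments so that it reads `sys.argv`.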