Add supervised fine-tuning baseline implementation #4

Open
Vicbi wants to merge 5 commits into main from add-sft_training

Conversation

@Vicbi Vicbi commented Sep 25, 2025

Add supervised fine-tuning baseline implementation

♻️ Current Situation & Problem

This PR adds supervised fine-tuning (SFT) functionality for causal language models:

  • Adds instruction-tuning capability to the pipeline, so models can be fine-tuned on instruction-following tasks.
  • Enables both full fine-tuning and parameter-efficient training via LoRA.

⚙️ Release Notes

Added comprehensive supervised fine-tuning implementation with flexible training options.

  • Training Options: Supports both full fine-tuning and LoRA parameter-efficient training.
  • Data Format: Expects JSONL files with instruction, optional input, and output fields.
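To make the expected data format concrete, here is a sketch of one JSONL record and an Alpaca-style template it could be rendered with. The template, field handling, and `build_prompt` helper are illustrative assumptions, not taken from this PR:

```python
import json

# Hypothetical example record in the expected JSONL format;
# the "input" field is optional and may be absent or empty.
record = {
    "instruction": "Summarize the following text.",
    "input": "Supervised fine-tuning adapts a pretrained LM to follow instructions.",
    "output": "SFT teaches a pretrained language model to follow instructions.",
}

def build_prompt(example: dict) -> str:
    """Render one JSONL record into a single training string (illustrative template)."""
    if example.get("input"):
        return (
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )
    return (
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )

line = json.dumps(record)              # one record per line in train.jsonl
prompt = build_prompt(json.loads(line))
print(prompt.splitlines()[0])          # → ### Instruction:
```

Records without the optional input field would simply omit the "### Input:" section.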

Usage Examples:

# Full SFT
python sft_baseline.py --model gpt2 --train_file train.jsonl --eval_file dev.jsonl --out_dir ./sft_out

# LoRA training
python sft_baseline.py --model meta-llama/Llama-3.2-8B --train_file train.jsonl \
    --eval_file dev.jsonl --out_dir ./lora_out --use_lora \
    --lora_r 16 --lora_alpha 32 --lora_dropout 0.05
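For context on what the training loop typically does with these records: SFT implementations commonly compute the loss only on the response tokens by setting prompt-token labels to -100, the default ignore_index of PyTorch's cross-entropy loss. Whether this PR masks prompts is not stated; the sketch below shows the general technique with plain lists and made-up token IDs:

```python
IGNORE_INDEX = -100  # default ignore_index for PyTorch CrossEntropyLoss

def mask_prompt_labels(input_ids, prompt_len):
    """Copy input_ids into labels, masking the first prompt_len positions.

    The model still attends to the full sequence, but cross-entropy loss
    is computed only on the response tokens that follow the prompt.
    """
    labels = list(input_ids)
    for i in range(min(prompt_len, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Toy example: 4 prompt tokens followed by 3 response tokens (IDs are arbitrary).
input_ids = [101, 2023, 2003, 1037, 7099, 3433, 102]
labels = mask_prompt_labels(input_ids, prompt_len=4)
print(labels)  # → [-100, -100, -100, -100, 7099, 3433, 102]
```

Without this masking, the model also spends loss budget on reproducing the instruction text, which is usually undesirable for instruction tuning.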

🚩 Next Steps

  • Curate and preprocess dataset for SFT.
  • Select target models for initial experiments.
  • Run benchmarking to validate implementation.

📝 Code of Conduct & Contributing Guidelines

By submitting this pull request, you agree to follow our Coding Guidelines.

@Vicbi Vicbi requested a review from joannalin22 September 25, 2025 00:32
@Vicbi Vicbi closed this Feb 16, 2026
@Vicbi Vicbi reopened this Feb 16, 2026
