Custom trl.SFTTrainer that adds a KL divergence loss between a LoRA-adapted model and its base model.
Updated Jul 25, 2025 - Python
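As a rough illustration of the idea in the description, the following dependency-free sketch adds a weighted KL penalty to a supervised fine-tuning loss. It is not the repository's code. The KL direction (adapted relative to base), the mean-over-tokens reduction, and the names `kl_divergence`, `combined_loss`, and `kl_weight` are all assumptions made for this example; a real trainer would compute the same quantity with tensor operations on model logits.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_divergence(adapted_logits, base_logits):
    """KL(adapted || base) for one token position, in nats.

    The direction is an assumption; the source does not specify it.
    """
    log_p = log_softmax(adapted_logits)
    log_q = log_softmax(base_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def combined_loss(sft_loss, adapted_logits_seq, base_logits_seq, kl_weight=0.1):
    """SFT loss plus a weighted mean per-token KL penalty.

    kl_weight is a hypothetical hyperparameter chosen for illustration.
    """
    kls = [kl_divergence(a, b)
           for a, b in zip(adapted_logits_seq, base_logits_seq)]
    mean_kl = sum(kls) / len(kls)
    return sft_loss + kl_weight * mean_kl
```

When the adapted and base logits agree at every position, the penalty is zero and the combined loss equals the plain SFT loss; the penalty grows as the adapted model's token distributions drift away from the base model's.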
This project implements active learning techniques, focusing on query strategies that select the most informative data points for model training. It aims to reduce the amount of labeled data required while improving model performance, especially when labeled data is scarce.
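The description does not name the query strategies it covers. As one common example, entropy-based uncertainty sampling picks the unlabeled examples whose predicted class distribution is most uncertain. The sketch below is an assumption for illustration, not the project's code; `entropy` and `select_most_uncertain` are hypothetical names.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Return indices of the k pool examples with the highest predictive entropy.

    pool_probs holds one predicted probability vector per unlabeled example.
    """
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:k]
```

The selected indices would then be sent for labeling and added to the training set before the model is retrained.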