This repository implements Weight-Decomposed Low-Rank Adaptation (DoRA), a novel parameter-efficient fine-tuning technique that outperforms the popular LoRA (Low-Rank Adaptation) method by decomposing pre-trained weights into magnitude and directional components.
DoRA is an adaptation technique introduced in the paper *DoRA: Weight-Decomposed Low-Rank Adaptation* (Liu et al., 2024) that enhances the learning capacity and training stability of LoRA by decomposing the pre-trained weight into two components:
- Magnitude component: Captures the norm of weight vectors
- Directional component: Represents the direction of weight vectors
This decomposition allows DoRA to achieve better performance compared to LoRA while maintaining similar computational efficiency.
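The decomposition itself can be illustrated directly in PyTorch. The sketch below uses an arbitrary random matrix, not code from this repository; it shows that splitting a weight into a column-wise magnitude vector and a unit-norm direction matrix loses no information:

```python
import torch

# Sketch of DoRA-style weight decomposition (shapes are illustrative):
# W is split into a magnitude vector m (one entry per column) and a
# direction matrix with unit-norm columns.
torch.manual_seed(0)
W = torch.randn(4, 3)

m = W.norm(p=2, dim=0, keepdim=True)  # magnitude: column-wise L2 norms, shape (1, 3)
direction = W / m                     # direction: every column has norm 1

# The product recovers the original weight
print(torch.allclose(m * direction, W))  # True
```

During fine-tuning, DoRA trains `m` directly while the direction is updated through a low-rank LoRA term, which is what gives it extra capacity over LoRA alone.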
- Complete implementation of both LoRA and DoRA layers from scratch
- Comparative analysis between vanilla training, LoRA, and DoRA
- Practical demonstration on the MNIST dataset with a multilayer perceptron
- Performance comparison showing DoRA's superior training dynamics
- Modular design for easy integration into existing models
The LoRA implementation decomposes weight updates into two low-rank matrices A and B:
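Concretely, a runnable version of the layer sketched below might look like this. The Gaussian initialization of A, zero initialization of B, and the α scaling follow the standard LoRA recipe and are assumptions, not necessarily this repository's exact code:

```python
import torch
import torch.nn as nn

class LoRALayer(nn.Module):
    """Low-rank weight update: ΔW·x computed as α · (x @ A @ B)."""
    def __init__(self, in_dim, out_dim, rank, alpha):
        super().__init__()
        # A: Gaussian init scaled by 1/sqrt(rank); B: zeros, so the
        # update starts at 0 and training starts from the frozen weights.
        self.A = nn.Parameter(torch.randn(in_dim, rank) / rank**0.5)
        self.B = nn.Parameter(torch.zeros(rank, out_dim))
        self.alpha = alpha

    def forward(self, x):
        return self.alpha * (x @ self.A @ self.B)

x = torch.randn(8, 16)
lora = LoRALayer(16, 10, rank=4, alpha=1.0)
print(lora(x).shape)                 # torch.Size([8, 10])
print(float(lora(x).abs().sum()))    # 0.0 at init, because B is zero
```

Only A and B are trained, so a rank-4 adapter here adds 16·4 + 4·10 = 104 parameters instead of the 160 of the full weight, and the saving grows with layer size.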
```python
class LoRALayer(nn.Module):
    def __init__(self, in_dim, out_dim, rank, alpha):
        # Low-rank decomposition: W + α·B·A
        ...
```

DoRA extends LoRA by additionally learning magnitude parameters:
```python
class LinearWithDoRA(nn.Module):
    def __init__(self, linear, rank, alpha):
        # Weight decomposition: m * (W + α·B·A) / ||W + α·B·A||
        ...
```
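Putting the two pieces together, here is a minimal runnable sketch of the DoRA wrapper. The column-wise norm and the initialization of the magnitude vector follow the DoRA paper; the repository's exact implementation may differ in details:

```python
import torch
import torch.nn as nn

class LoRALayer(nn.Module):
    # Minimal LoRA parameters used by the DoRA wrapper below
    def __init__(self, in_dim, out_dim, rank, alpha):
        super().__init__()
        self.A = nn.Parameter(torch.randn(in_dim, rank) / rank**0.5)
        self.B = nn.Parameter(torch.zeros(rank, out_dim))
        self.alpha = alpha

class LinearWithDoRA(nn.Module):
    def __init__(self, linear, rank, alpha):
        super().__init__()
        self.linear = linear  # frozen pre-trained layer
        self.lora = LoRALayer(linear.in_features, linear.out_features, rank, alpha)
        # Trainable magnitude, initialized to the column norms of W
        self.m = nn.Parameter(self.linear.weight.norm(p=2, dim=0, keepdim=True))

    def forward(self, x):
        # Low-rank update, transposed to match weight layout (out_dim, in_dim)
        delta = self.lora.alpha * (self.lora.A @ self.lora.B).T
        combined = self.linear.weight + delta
        # Normalize columns to unit norm, then rescale by the learned magnitude
        direction = combined / combined.norm(p=2, dim=0, keepdim=True)
        return nn.functional.linear(x, self.m * direction, self.linear.bias)

linear = nn.Linear(16, 10)
dora = LinearWithDoRA(linear, rank=4, alpha=1.0)
x = torch.randn(8, 16)
# At init B is zero and m equals ||W||, so DoRA reproduces the frozen layer
print(torch.allclose(dora(x), linear(x), atol=1e-6))  # True
```

Because `m` starts at the column norms of the pre-trained weight, fine-tuning begins exactly at the pre-trained model and can then adjust magnitude and direction independently.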
Requirements:

- torch
- torchvision
- numpy

Installation:

- Clone the repository:
  ```shell
  git clone https://github.com/samz905/dora
  cd DoRA
  ```

- Install dependencies:

  ```shell
  pip install torch torchvision numpy
  ```

- Run the notebook:

  ```shell
  jupyter notebook DoRA.ipynb
  ```

Feel free to open issues and pull requests for improvements and bug fixes.
This project is open-source and available under the MIT License.