ND-LoRA: Neural Diversity Low-Rank Adaptation

Neural Diversity Regularizes Hallucinations in Small Language Models

🔥 Key Results | 💡 Paper (arXiv) | 📚 Citation

Overview

ND-LoRA implements Neural Diversity Low-Rank Adaptation, a novel training method that combines stream-specific LoRA adapters with Barlow Twins regularization to reduce hallucinations in small language models. Our approach achieves significant improvements in factuality and faithfulness across multiple benchmarks while maintaining model quality.

Key Results

  • 15-25% reduction in hallucination rates on TruthfulQA, HaluEval, and MemoTrap benchmarks
  • Parameter-efficient: Only 0.5-2% additional parameters compared to base model
  • Causally validated: Neural diversity causally reduces hallucinations (p < 0.001)

Installation

Requirements

  • Python 3.9+
  • PyTorch 2.0+ with CUDA or MPS support
  • 16GB+ RAM (32GB recommended)

Setup

# Clone repository
git clone https://github.com/kushalc/nd-lora.git
cd nd-lora

# Install dependencies
pip install -r requirements.txt

# Initialize ParScale submodule
git submodule update --init --recursive

Quick Start

Training

# Train ND-LoRA model with P=4 streams
python train_ndlora.py \
  --P=4 \
  --use-stream-lora \
  --orthogonal-lora \
  --bt-normalization-warmup \
  --target-tokens=20_000_000

# Or use Modal for distributed training
modal run train_ndlora::modal__nslP4__OptC9

Evaluation

# Run hallucination benchmarks
cd leaderboard
python backend_cli.py --model YOUR_MODEL_PATH

# Or use evaluation scripts
python eval_experiments.py --checkpoint PATH_TO_CHECKPOINT
python eval_neurodiversity.py --checkpoint PATH_TO_CHECKPOINT

Model Downloads

Pre-trained model checkpoints are available for all configurations reported in the paper:

  • Baselines: Qwen2.5-0.5B with P=1 (R=32/64/128)
  • ParScale: P=2/4/8 with shared LoRA and Barlow Twins
  • ND-LoRA: P=2/4/8 with stream-specific LoRA and optimized regularization
  • Ablations: Module ablations, architectural variants

See utils/model_checkpoints_paper.py for checkpoint paths and configurations.

Using Model Checkpoints

The model_checkpoints_paper.py module provides organized access to all paper-essential model checkpoints:

from utils.model_checkpoints_paper import (
    CORE_CHECKPOINTS,      # Main results (Tables 1, 7, 8, 9)
    ABLATION_CHECKPOINTS,  # Ablation studies (Table 4)
    MODULE_ABLATION_CHECKPOINTS,  # Module ablations (Table 6)
    ALL_CHECKPOINTS,       # Combined dictionary
    MODEL_NAMES,          # Human-readable names
    BASE_CHECKPOINTS      # Base model paths
)

# Access checkpoint paths
checkpoint_path = CORE_CHECKPOINTS["ND-LoRA_P4"]  # S3 path for ND-LoRA P=4 model
model_name = MODEL_NAMES["ND-LoRA_P4"]  # "ND-LoRA (P=4, OptC9)"

# Use with evaluation scripts (these are shell commands, not Python)
python analyze_experiments.py --model-whitelist nd-lora/
python eval_experiments.py --checkpoint CHECKPOINT_PATH

Reading Evaluation Results

The analyze_experiments.py script can read evaluation results from evals-* directories and generate publication-ready plots:

# Generate analysis plots from evaluation results
python analyze_experiments.py \
  --results-base-path leaderboard \
  --output-dir plots \
  --plot-mode all pub \
  --analysis-mode full \
  --baseline-mode single-stream

# View generated plots
open plots/pub-full-single-stream-relative.png

The script automatically:

  • Reads from leaderboard/evals-{analysis_mode}/ directories
  • Maps raw S3 checkpoint paths to human-readable model names using MODEL_NAMES
  • Generates absolute and relative performance heatmaps
  • Creates model-level and evaluation-level summary statistics
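
The name mapping and the relative view described above can be sketched as follows. This is a minimal illustration, not the code in analyze_experiments.py: the dictionaries stand in for those in utils/model_checkpoints_paper.py, and the "P1_r64" key, the S3 paths, and the scores are hypothetical placeholders. Only the "ND-LoRA_P4" key and its display name come from this README.

```python
# Hypothetical stand-ins for the dictionaries in utils/model_checkpoints_paper.py.
# Paths and the "P1_r64" entry are placeholders invented for this sketch.
CORE_CHECKPOINTS = {
    "P1_r64": "s3://example-bucket/p1-r64",
    "ND-LoRA_P4": "s3://example-bucket/nd-lora-p4",
}
MODEL_NAMES = {
    "P1_r64": "Baseline (P=1, R=64)",
    "ND-LoRA_P4": "ND-LoRA (P=4, OptC9)",
}


def path_to_name(checkpoints, names):
    """Invert key -> path into path -> human-readable name."""
    return {path: names[key] for key, path in checkpoints.items()}


def relative_scores(scores, baseline):
    """Express each score as a fractional change against the baseline model."""
    base = scores[baseline]
    return {model: (value - base) / base for model, value in scores.items()}


lookup = path_to_name(CORE_CHECKPOINTS, MODEL_NAMES)

# Placeholder scores keyed by raw checkpoint path (not real results).
raw = {"s3://example-bucket/p1-r64": 50.0, "s3://example-bucket/nd-lora-p4": 55.0}
named = {lookup[path]: score for path, score in raw.items()}
relative = relative_scores(named, baseline="Baseline (P=1, R=64)")

for model, change in relative.items():
    print(f"{model}: {change:+.1%}")
```

The absolute view corresponds to `named`, and the relative view to `relative`, with the baseline row at zero by construction.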

Note: Checkpoints will be migrated to public hosting soon. Check back for updated URLs.

Reproducing Paper Results

All experiments in the paper can be reproduced using Modal for distributed execution:

Core Results (Tables 1, 7, 8, 9)

# P=1 baselines (parameter-matched)
modal run train_ndlora::modal__P1__r32
modal run train_ndlora::modal__P1__r64
modal run train_ndlora::modal__P1__r128

# ParScale baselines
modal run train_ndlora::modal__P2__r32
modal run train_ndlora::modal__P4__r64
modal run train_ndlora::modal__P8__r128

# ND-LoRA main results (Optuna-optimized)
modal run train_ndlora::modal__nslP2__OptC9
modal run train_ndlora::modal__nslP4__OptC9
modal run train_ndlora::modal__nslP8__OptC9
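
The P=1 baselines above are described as parameter-matched. One way to read the r32/r64/r128 ranks, assuming the per-stream rank of 16 listed under Key Hyperparameters and counting adapter weights only, is that a LoRA adapter's size is linear in its rank, so P adapters of rank 16 cost the same as one adapter of rank 16·P. The short check below illustrates that arithmetic. The layer dimensions are hypothetical and `lora_params` is a helper written for this sketch, not a function from the repository.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights in one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical projection size, chosen only for illustration.
D_IN, D_OUT = 896, 896
STREAM_RANK = 16  # per-stream rank from the Key Hyperparameters table

for streams in (2, 4, 8):
    matched_rank = streams * STREAM_RANK
    multi_stream = streams * lora_params(D_IN, D_OUT, STREAM_RANK)
    single_stream = lora_params(D_IN, D_OUT, matched_rank)
    print(
        f"P={streams}: {multi_stream} stream-adapter weights "
        f"vs {single_stream} for P=1 at rank {matched_rank}"
    )
```

Any other per-stream parameters the method may add are outside this count.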

Ablation Studies (Tables 4, 6)

# Component ablations
modal run train_ndlora::modal__lP4__r64      # ParScale-BT
modal run train_ndlora::modal__sP4           # Stream-LoRA
modal run train_ndlora::modal__slP4          # Stream-LoRA-BT
modal run train_ndlora::modal__nslP4         # ND-LoRA (original HP)

# Module ablations
modal run train_ndlora::modal__p4_nOSL_ablation__modules

Evaluation

# Deep evaluation (N=1024 samples per task)
cd leaderboard
python eval-cli.py --checkpoint CHECKPOINT_PATH --mode deep

# Corruption experiments for causality analysis
python eval_neurodiversity.py \
  --checkpoint CHECKPOINT_PATH \
  --corruption-methods substitute_tokens substitute_streams \
  --n-samples 128

Architecture

ND-LoRA Components

  1. Parallel Streams (P): Multiple computation paths through the model
  2. Stream-Specific LoRA: Independent low-rank adapters for each stream
  3. Barlow Twins Regularization: Decorrelation loss to maintain neural diversity
  4. Optimized Hyperparameters: λ_BT tuned via Optuna for each P value
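
To make the first three components concrete, here is a toy NumPy sketch: a frozen base weight shared by all streams, one low-rank update per stream, and the standard Barlow Twins objective averaged over stream pairs. It is an illustration under stated assumptions, not the code in train_ndlora.py. The shapes are tiny, the pairwise averaging and `off_diag_weight` are choices made for this sketch, and the repository's actual loss and how λ_BT enters it may differ.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, off_diag_weight=5e-3, eps=1e-6):
    """Standard Barlow Twins objective between two (batch, features) arrays."""
    # Standardize each feature over the batch so c holds correlations.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    c = z_a.T @ z_b / z_a.shape[0]  # (features, features) cross-correlation
    on_diag = np.sum((np.diag(c) - 1.0) ** 2)
    off_diag = np.sum(c ** 2) - np.sum(np.diag(c) ** 2)
    return on_diag + off_diag_weight * off_diag


rng = np.random.default_rng(0)
batch, d_model, rank, streams = 32, 8, 2, 4

x = rng.normal(size=(batch, d_model))
w = rng.normal(size=(d_model, d_model))  # frozen base weight, shared by all streams

# Stream-specific LoRA factors. B is usually zero-initialized in LoRA;
# it is random here only so the toy streams differ from the start.
a = rng.normal(size=(streams, rank, d_model)) * 0.1
b = rng.normal(size=(streams, d_model, rank)) * 0.1

# Each stream applies the shared weight plus its own low-rank update B_p A_p.
outputs = [x @ (w + b[p] @ a[p]).T for p in range(streams)]

# Average the loss over all unordered pairs of streams (an assumption of this sketch).
pair_losses = [
    barlow_twins_loss(outputs[i], outputs[j])
    for i in range(streams)
    for j in range(i + 1, streams)
]
bt_loss = float(np.mean(pair_losses))
print(f"mean pairwise Barlow Twins loss over {len(pair_losses)} pairs: {bt_loss:.4f}")
```

In a training loop a term like `bt_loss` would be added to the language-modeling loss with a tunable weight. That combination is not shown because this README does not specify it.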

Key Hyperparameters

Parameter      P=2     P=4     P=8
LoRA Rank      16      16      16
λ_BT           0.29    0.58    0.13
Design Layer   20      20      20
LoRA Modules   q,k,v   q,k,v   q,k,v
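
For scripting, the same values can be held in a small lookup keyed by P. The sketch below copies the numbers from the table above. The `NDLoRAConfig` class, its field names, and `PAPER_CONFIGS` are illustrative and are not part of the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NDLoRAConfig:
    """Illustrative container for one column of the hyperparameter table."""
    streams: int          # P, the number of parallel streams
    lora_rank: int        # rank of each stream-specific adapter
    lambda_bt: float      # weight on the Barlow Twins term
    design_layer: int     # "Design Layer" row of the table
    lora_modules: tuple   # projections that receive adapters


# Values copied from the Key Hyperparameters table.
PAPER_CONFIGS = {
    2: NDLoRAConfig(2, 16, 0.29, 20, ("q", "k", "v")),
    4: NDLoRAConfig(4, 16, 0.58, 20, ("q", "k", "v")),
    8: NDLoRAConfig(8, 16, 0.13, 20, ("q", "k", "v")),
}

config = PAPER_CONFIGS[4]
print(config)
```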

Repository Structure

nd-lora/
├── train_ndlora.py              # Main training script with Modal entrypoints
├── eval_experiments.py          # Hallucination benchmark evaluation
├── eval_neurodiversity.py       # Causality experiments (corruption analysis)
├── ParScale/                    # Core ParScale implementation (submodule)
├── utils/
│   ├── model_checkpoints_paper.py  # Paper-essential model checkpoints
│   ├── model_utils.py           # Model loading and PEFT setup
│   ├── stream_diagnostics.py    # Stream analysis and monitoring
│   └── ...                      # Other utilities
├── leaderboard/                 # Hallucination evaluation framework
│   ├── backend_cli.py           # Evaluation worker
│   ├── app.py                   # Gradio web interface
│   └── src/backend/tasks/       # Custom evaluation tasks
├── paper/                       # LaTeX source for paper
└── docs/                        # Implementation documentation

Modal Integration

This project uses Modal for running experiments and evaluations. Modal entrypoints in train_ndlora.py allow distributed training across cloud GPUs.

Setting up Modal

# Install Modal CLI
pip install modal

# Authenticate
modal token new

# Run experiment
modal run train_ndlora::modal__nslP4__OptC9

Citation

If you use this code or find our work helpful, please cite:

@article{chakrabarti2025neurodiversity,
  title={Neural Diversity Regularizes Hallucinations in Small Language Models},
  author={Chakrabarti, Kushal and Balachundhar, Nirmal},
  journal={arXiv preprint arXiv:2510.20690},
  year={2025}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

For questions or issues, please open a GitHub issue or contact the authors.
