AALC: Large Language Model Efficient Reasoning via Adaptive Accuracy-Length Control

Paper | Blog | Model | Data

This repo is built on top of the VERL GitHub repo.

Getting Started

To set up the environment, run the following commands:

conda create -n lr python=3.10
conda activate lr
pip install torch==2.6.0 torchvision==0.21.0
pip install flash-attn==2.8.2 --no-build-isolation

git clone https://github.com/du-nlp-lab/AALC
cd AALC
pip install -r requirements.txt
pip install -e . --no-deps
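
Equivalently, the conda and torch steps above can be captured in an environment file (a sketch only; flash-attn and the editable install of AALC still need the separate commands because they require `--no-build-isolation` and `--no-deps`, which a pip subsection cannot express):

```yaml
# environment.yml — hypothetical one-shot equivalent of the commands above
name: lr
dependencies:
  - python=3.10
  - pip
  - pip:
      - torch==2.6.0
      - torchvision==0.21.0
```

Create it with `conda env create -f environment.yml`, then continue from the `pip install flash-attn` step.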

If you hit a flash-attn-related error, run the following commands:

pip uninstall flash-attn -y
pip cache remove flash_attn
pip install flash-attn==2.7.4.post1 --no-build-isolation

If you hit an apex-related error, or want to install apex, run the following commands from outside the AALC folder:

pip uninstall apex -y
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Train and Test Models

To train a model, first confirm all parameters in train_grpo_math_LP.sh, then run:

bash train_grpo_math_LP.sh

or, if using a length penalty:

bash train_grpo_math_LP_penalty.sh

Testing a checkpoint follows the same procedure: confirm the parameters in test_grpo_math_LP.sh, then run:

bash test_grpo_math_LP.sh

Citation

@article{li2025aalc,
  title={AALC: Large Language Model Efficient Reasoning via Adaptive Accuracy-Length Control},
  author={Li, Ruosen and Luo, Ziming and Zhang, Quan and Li, Ruochen and Zhou, Ben and Payani, Ali and Du, Xinya},
  journal={arXiv preprint arXiv:2506.20160},
  year={2025}
}
