A customized fork of TransformerCompression, the official repository of the paper SliceGPT (ICLR'24).
```
$ virtualenv env
$ source env/bin/activate
$ pip install -r requirements.txt
```

Experiment arguments are set via an `argparse.Namespace` built from a local configuration file, rather than from the bash shell as in the original repo.
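The configuration-file loading described above can be sketched as follows. This is a minimal illustration, not the fork's actual loader: the function name `load_config` is hypothetical, and it assumes PyYAML is installed (it is not necessarily in `requirements.txt`).

```python
# Hypothetical sketch: build an argparse.Namespace from a configs/*.yaml file
# instead of parsing bash-shell flags as the original repo did.
import argparse

import yaml  # assumes PyYAML (pip install pyyaml)


def load_config(path: str) -> argparse.Namespace:
    """Read a YAML config and expose each key as a Namespace attribute."""
    with open(path) as f:
        cfg = yaml.safe_load(f) or {}
    # YAML keys use dashes (e.g. model-path); Namespace attributes use underscores.
    return argparse.Namespace(**{k.replace("-", "_"): v for k, v in cfg.items()})
```

A blank YAML value (e.g. `model-path:`) loads as `None`, which matches the "leave blank" convention used below.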
To set the experiment arguments, edit the desired configurations (args) in `configs/*.yaml`. The configuration rules for the different use cases are as follows:
Leave `model-path` and `sliced-model-path` blank and set `model` to the model name on Hugging Face.
Additional models supported on top of the originally supported models:
- meta-llama/Llama-3.2-1B-Instruct
- meta-llama/Llama-3.2-3B-Instruct
- meta-llama/Llama-3.1-8B-Instruct
- microsoft/Phi-3.5-mini-instruct
- microsoft/phi-4
Supported datasets: `wikitext2`, `ptb`, `alpaca`.
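A config for this case might look like the fragment below. Only `model`, `model-path`, and `sliced-model-path` are key names taken from this README; the filename and the dataset key are illustrative and may differ in the actual `configs/*.yaml` files.

```yaml
# configs/example.yaml — hypothetical sketch, not a file shipped with the repo
model: microsoft/phi-4   # Hugging Face model name
model-path:              # leave blank
sliced-model-path:       # leave blank
dataset: wikitext2       # one of wikitext2, ptb, alpaca (key name assumed)
```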
Use the same args as for slicing, except set `model-path` to the same value as `model`. Then, in `experiments/run_slicegpt.py`:

```
kwargs = prepare_slicing(slicing_args)
slicing_main(slicing_args, kwargs)
```

Use the same args as for the unsliced setting, except set `sliced-model-path` to the local path that stores the sliced model.

```
prepare_slicing(slicing_args)
```

TBC.