Thank you for sharing this great paper and the novel MoE-LPR idea.
The repository currently depends on older versions of transformers and peft that do not support Llama-3 training. Since the paper reports results on Llama-3, could you please update the submodules (or provide compatible setup instructions) to enable Llama-3 support in this repo?
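For context, here is a minimal smoke test one could run to see where the pinned versions break. The version floors are my assumptions (the Meta-Llama-3 model card suggests transformers >= 4.40; the peft bound is a guess), not pins verified against this repo:

```python
# Rough sanity check for Llama-3 support -- version floors below are
# assumptions, not official pins for this repository.
from packaging import version

import peft
import transformers

assert version.parse(transformers.__version__) >= version.parse("4.40.0"), \
    "Llama-3 support landed around transformers 4.40 (per the model card)"
assert version.parse(peft.__version__) >= version.parse("0.10.0"), \
    "guess: a peft release compatible with transformers >= 4.40"

from transformers import AutoTokenizer

# Loading the tokenizer is a useful check: older transformers versions fail
# here on Llama-3's newer fast-tokenizer format. Note the repo is gated, so
# this requires Hugging Face authentication.
tok = AutoTokenizer.from_pretrained("meta llama/Meta-Llama-3-8B".replace(" ", "-"))
print("Llama-3 tokenizer loaded with transformers", transformers.__version__)
```

On the versions currently pinned by the submodules, the tokenizer load is where things fall over for me.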
Thanks in advance!