Pinned
-
CASE-Lab-UMD/LLM-Drop (Public)
The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".
-
CASE-Lab-UMD/Unified-MoE-Compression (Public)
The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques" (TMLR).
-
CASE-Lab-UMD/Router-Tuning-Mixture-of-Depths (Public)
The open-source Mixture of Depths code and the official implementation of the paper "Router-Tuning: A Simple and Effective Approach for Enabling Dynamic Depth in Transformers" (EMNLP 2025).
-
SparseUnifiedModel (Public)
The official implementation of the paper "Understanding and Harnessing Sparsity in Unified Multimodal Models".
Python 19
-
CASE-Lab-UMD/Capacity-Aware-MoE (Public)
The official implementation of the paper "Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts".
Python 11


