KTransformers Roadmap - 2025 Q4
Focus
- Usability: Easy installation on x86 + NV GPU, more documentation and an FAQ.
- Model Coverage: K2 Thinking INT4 native support, Qwen3 fine-tuning.
Usability
- Refactor the KTransformers repository structure (Refactor: restructure repository to focus on kt-kernel and KT-SFT modules, #1581). @SkqLiao
- Docker-based startup for sglang-kt inference. @SkqLiao
- Local quantization scripts. @ouqingliang
- Documentation: installation guide, coverage information, contribution guide and FAQ. @SkqLiao @ErvinXie
Model Coverage
- K2 Thinking INT4 native support ([Feature] Support Native K2 Thinking, #1598). @ouqingliang @chenht2022
- Support Qwen3 series fine-tuning ([Feature] KTransformers Fine-tuning Feature Compatibility & Key Enhancements Support Tracking, #1575; add qwen3 attn, #1602). @JimmyPeilinLi and community contributors.
- Qwen3 VL inference support. TBD
Performance / Features
- Layerwise prefill. @chenht2022
- AVX2 support for inference and fine-tuning. @SkqLiao @JimmyPeilinLi
- AMD adaptation ([Feature] Support amd blis optimizatioin, #1601). @KMSorSMS
- EPLB for inference. @chenht2022
Contribution / Maintenance
- Build CI workflow for kt-kernel. TBD
- Merge KT-SFT and kt-kernel. @JimmyPeilinLi
- Bi-weekly Office Hour. @KMSorSMS
Contributions are welcome. Please email ervinxie@qq.com if you would like to join the development WeChat group.