Here is the development roadmap for H1 2026. We will pin this roadmap in Issues and track most of our subsequent work by updating it there. Each version of the roadmap will be archived in MLLM's documentation, along with an outlook. Contributions and feedback are welcome.
Focus
- pymllm for embodied robots/agents on Jetson Orin/Thor.
- Work on mllm's Arm and NPU support will continue, covering more models.
- NPU AOT shape bucketing optimization
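For readers unfamiliar with the term, shape bucketing for ahead-of-time (AOT) compiled graphs generally means compiling a small set of fixed input shapes and padding each request up to the nearest one. The sketch below illustrates only that general idea. The bucket sizes, `pick_bucket`, and `pad_to_bucket` are hypothetical names chosen for this example and are not mllm APIs; the roadmap does not specify how the optimization will be implemented.

```python
from bisect import bisect_left

# Hypothetical bucket sizes (assumption); a real AOT build would pick these per model.
BUCKETS = [32, 64, 128, 256, 512]


def pick_bucket(seq_len, buckets=BUCKETS):
    """Return the smallest bucket that can hold seq_len tokens."""
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    i = bisect_left(buckets, seq_len)
    if i == len(buckets):
        raise ValueError(f"seq_len {seq_len} exceeds largest bucket {buckets[-1]}")
    return buckets[i]


def pad_to_bucket(token_ids, pad_id=0, buckets=BUCKETS):
    """Right-pad token_ids to the chosen bucket; return (padded, valid_len)."""
    valid_len = len(token_ids)
    bucket = pick_bucket(valid_len, buckets)
    padded = list(token_ids) + [pad_id] * (bucket - valid_len)
    return padded, valid_len
```

Fewer buckets mean fewer compiled graphs but more wasted padding, so any bucketing optimization is a trade-off between those two costs.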
Model coverage
- Gemma3n (with support for AltUp, Embedding, and SWA)
- Qwen3-VL 2B
- Qwen3.5 0.8B/4B/9B
Kernels
- GDN kernel for Qwen3.5
- Marlin kernel for pymllm (mllm-kernel)
- GDN kernel for pymllm (mllm-kernel)
Pymllm
- Radix Cache correctness
- Qwen3 MoE and Qwen3.5 optimization
- Optimizing CPU busy loop
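As background on the Radix Cache item, a prefix cache lets a new request reuse work already done for a previously seen token prefix, and correctness there mostly means never reporting a longer match than the tokens actually share. The sketch below is a deliberately simplified illustration that uses a plain per-token trie rather than a compressed radix tree. `PrefixCache` and its methods are hypothetical names for this example and do not reflect pymllm's implementation.

```python
class PrefixCache:
    """Toy per-token trie (assumption: a simplification of a radix tree)."""

    def __init__(self):
        self.root = {}

    def insert(self, tokens):
        """Record a token sequence so later requests can match its prefixes."""
        node = self.root
        for t in tokens:
            node = node.setdefault(t, {})

    def match_len(self, tokens):
        """Return how many leading tokens are already present in the cache."""
        node = self.root
        matched = 0
        for t in tokens:
            if t not in node:
                break
            node = node[t]
            matched += 1
        return matched


cache = PrefixCache()
cache.insert([1, 2, 3, 4])
shared = cache.match_len([1, 2, 5])  # only the first two tokens are shared
```

A property test for this kind of structure would check that `match_len` never exceeds the true longest common prefix with any inserted sequence.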