Development Roadmap (2026 H1) #651

@chenghuaWang

Here is the development roadmap for H1 2026. We will pin this roadmap in Issues and track most of our subsequent work here. Each version of the roadmap will be archived in MLLM's documentation, along with a brief outlook. Contributions and feedback are welcome.

Focus

  • pymllm for embodied robots/agents on Jetson Orin/Thor.
  • Work on mllm's Arm and NPU backends will continue, with support for more models.
  • NPU AOT shape bucketing optimization

Model coverage

  • Gemma 3n (with support for AltUp, Embedding, and SWA)
  • Qwen3-VL 2B
  • Qwen3.5 0.8B/4B/9B

Kernels

  • GDN kernel for Qwen3.5
  • Marlin kernel for pymllm (mllm-kernel)
  • GDN kernel for pymllm (mllm-kernel)

Pymllm

  • Radix Cache correctness
  • Qwen3 MoE and Qwen3.5 optimization
  • Optimizing CPU busy loop
