[Vulkan] Implement vendor/architecture-specific matmul tuning #6

@goniz

Problem

MLX's Vulkan backend uses a single default MLX_VULKAN_MATMUL_SPEC tuple on every device unless it is manually overridden through the environment variable. ggml, by contrast, has a much richer architecture/tuning matrix, including vendor detection, subgroup characteristics, cooperative matrix probing, and many precompiled matmul/FA (flash attention) variants.
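For illustration, the vendor-detection step could start from the vendorID field that Vulkan reports in VkPhysicalDeviceProperties. The IDs below are the well-known PCI vendor IDs; the helper name classify_vendor and the "unknown" fallback are assumptions made for this sketch, not existing MLX or ggml code, and Python is used only for brevity.

```python
# Well-known PCI vendor IDs, as reported in VkPhysicalDeviceProperties.vendorID.
VENDOR_IDS = {
    0x1002: "amd",
    0x10DE: "nvidia",
    0x8086: "intel",
    0x106B: "apple",
    0x13B5: "arm",
    0x5143: "qualcomm",
}


def classify_vendor(vendor_id: int) -> str:
    """Hypothetical helper: map a Vulkan vendorID to a tuning-table key.

    Unrecognized IDs map to "unknown", so the caller can fall back to the
    existing default matmul spec.
    """
    return VENDOR_IDS.get(vendor_id, "unknown")
```

A device that maps to "unknown" would keep today's behavior, so the change stays safe on hardware without a dedicated tuning entry.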

Tasks

  • Build vendor/architecture tuning table with:
    • matmul tile sizes
    • subgroup sizes
    • aligned vs unaligned variants
    • f16acc vs f32acc
    • split-K thresholds
    • small/medium/large kernel families
  • Add subgroup-size control and cooperative matrix capability detection
  • Add integer-dot-product support detection
  • Add architecture-specific kernel selection
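
The tasks above could fit together roughly as sketched below. This is a minimal sketch under stated assumptions, not a proposed implementation: every name (DeviceCaps, MatmulTuning, TUNING_TABLE, select_kernel), every tile size and threshold, and the selection heuristics are hypothetical placeholders that would need to be replaced by measured values. Python is used only for brevity; the real table would live in the backend.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class DeviceCaps:
    """Capabilities probed once per device (all field names are hypothetical)."""
    vendor: str              # e.g. "amd", "nvidia", "intel", "unknown"
    min_subgroup_size: int   # from the subgroup-size-control properties
    max_subgroup_size: int
    has_coopmat: bool        # cooperative matrix support
    has_int_dot: bool        # integer dot product support
    has_f16: bool            # fp16 arithmetic support


@dataclass(frozen=True)
class MatmulTuning:
    tile_m: int
    tile_n: int
    tile_k: int
    subgroup_size: int       # preferred size, clamped to the device range
    split_k_threshold: int   # consider split-K when K >= this value


# Placeholder numbers only; real entries would come from benchmarking.
DEFAULT_TUNING = {
    "small": MatmulTuning(32, 32, 16, 32, 2048),
    "medium": MatmulTuning(64, 64, 16, 32, 4096),
    "large": MatmulTuning(128, 128, 16, 32, 8192),
}
TUNING_TABLE = {
    "amd": {
        "small": MatmulTuning(32, 32, 16, 64, 2048),
        "medium": MatmulTuning(64, 64, 16, 64, 4096),
        "large": MatmulTuning(128, 128, 16, 64, 4096),
    },
}
MAX_TILES_FOR_SPLIT_K = 8  # assumption: split-K only when few output tiles exist


@dataclass(frozen=True)
class KernelChoice:
    family: str
    tuning: MatmulTuning
    subgroup_size: int
    aligned: bool
    f16acc: bool
    split_k: bool
    path: str                # "int_dot", "coopmat", or "scalar"


def select_kernel(caps, m, n, k, allow_f16acc=False, quantized=False):
    # Pick the kernel family from the output shape (thresholds are assumptions).
    if m <= 32 or n <= 32:
        family = "small"
    elif m <= 64 or n <= 64:
        family = "medium"
    else:
        family = "large"
    tuning = TUNING_TABLE.get(caps.vendor, DEFAULT_TUNING)[family]

    # Clamp the preferred subgroup size to what the device allows.
    subgroup = min(max(tuning.subgroup_size, caps.min_subgroup_size),
                   caps.max_subgroup_size)

    # The aligned variant needs K to be a multiple of the K tile.
    aligned = k % tuning.tile_k == 0

    # Split-K helps when K is long but the output has too few tiles.
    tiles = ceil(m / tuning.tile_m) * ceil(n / tuning.tile_n)
    split_k = k >= tuning.split_k_threshold and tiles < MAX_TILES_FOR_SPLIT_K

    if quantized and caps.has_int_dot:
        path = "int_dot"
    elif caps.has_coopmat:
        path = "coopmat"
    else:
        path = "scalar"

    return KernelChoice(family, tuning, subgroup, aligned,
                        allow_f16acc and caps.has_f16, split_k, path)
```

With these placeholder values, select_kernel(DeviceCaps("amd", 32, 64, False, True, True), 512, 512, 4096) picks the large family, the aligned variant, a subgroup size of 64, and no split-K, while a 16x16 output with K = 4100 on an unknown vendor falls back to the default small family, the unaligned variant, and split-K. Keeping the table as plain data makes it easy to add vendors or architectures without touching the selection logic.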

Related

See Tier 2, item 1 in the performance analysis report.
