
Fix QuantizedLinear for non-affine quantization modes (mxfp4)#384

Merged
davidkoski merged 1 commit into ml-explore:main from Nicolas-nwb:fix/quantized-linear-mxfp4
Apr 6, 2026

Conversation

@Nicolas-nwb
Contributor

Proposed changes

QuantizedLinear fails to load mxfp4-quantized models with .keyNotFound for "biases". Two bugs:

  1. Mode not forwarded during quantization — The weight-based init calls MLX.quantized(weight, groupSize:bits:) without passing mode, so weights are always quantized as affine regardless of the mode specified. QuantizedEmbedding already passes mode correctly.

  2. Missing biases treated as error for non-affine modes — Module.update(parameters:verify:) with verify: .all calls updateMissing() for the "biases" key, which throws .keyNotFound. Non-affine modes (e.g. .mxfp4) don't produce biases, so this key is legitimately absent.

Fix

  • Pass mode: mode to MLX.quantized() in QuantizedLinear.init(weight:bias:groupSize:bits:mode:)
  • Override updateMissing() to skip "biases" when mode != .affine
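The first bug is a classic defaulted-parameter pitfall. A toy sketch of it (hypothetical stand-in names, not the actual MLX API): when the inner function defaults its `mode` parameter to affine and the wrapper forgets to forward it, the caller's mode is silently dropped.

```swift
// Toy model of the mode-forwarding bug (stand-in names, not the real MLX API).
enum QuantizationMode { case affine, mxfp4 }

// Stand-in for the quantize call: `mode` defaults to .affine, and only
// affine quantization produces biases.
func quantized(groupSize: Int = 64, bits: Int = 4,
               mode: QuantizationMode = .affine) -> (mode: QuantizationMode, hasBiases: Bool) {
    (mode, mode == .affine)
}

// Buggy wrapper: accepts `mode` but never forwards it, so the default
// .affine is always used.
func buggyInit(mode: QuantizationMode) -> (mode: QuantizationMode, hasBiases: Bool) {
    quantized(groupSize: 32, bits: 4)           // `mode` silently dropped
}

// Fixed wrapper: forwards `mode`, as QuantizedEmbedding already does.
func fixedInit(mode: QuantizationMode) -> (mode: QuantizationMode, hasBiases: Bool) {
    quantized(groupSize: 32, bits: 4, mode: mode)
}

print(buggyInit(mode: .mxfp4).hasBiases)  // true: spurious affine biases
print(fixedInit(mode: .mxfp4).hasBiases)  // false: mxfp4 produces no biases
```

The fix is the one-token change the PR makes: add `mode: mode` at the call site.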

Tests

  • testQuantizedLinearMxfp4DoesNotCreateAffineBiases — verifies mxfp4 produces nil biases
  • testQuantizedLinearMxfp4ParametersRoundTripWithoutBiases — verifies update(parameters:, verify: .all) succeeds without a "biases" key

Checklist

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

Comment on lines +339 to +341
if parameter == "biases" && mode != .affine {
return
}
Collaborator

I am curious about this -- the init for QuantizedLinear should have left us with a biases = nil and we shouldn't hit this case. I wonder what happened here?

Collaborator

Ah, perhaps this was an earlier attempt to fix it?

[Screenshot: debugger output from the test, showing biases is nil]

this is from the test and biases is indeed nil. I think this method should be removed because the missing parameter will be handled correctly -- this is redundant and confuses the issue by suggesting we need special handling.

Collaborator
@davidkoski left a comment

Please look at my comment on updateMissing() -- I think it should be removed, but let me know if you think otherwise.

Nice find and fix!

QuantizedLinear.init(weight:...) called MLX.quantized() without
forwarding the mode parameter, so weights were always quantized
as affine regardless of the specified mode. This produced spurious
biases for non-affine modes like mxfp4.

Fix: pass mode: mode to MLX.quantized(), matching QuantizedEmbedding
which already does this correctly.

Note: no updateMissing() override is needed — when biases is nil,
Module.build(value:) wraps it as .value(.other(...)), and the
(.value(.other(_)), .none) case in update() already breaks silently.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Nicolas-nwb force-pushed the fix/quantized-linear-mxfp4 branch from b68356b to 15532c3 on April 5, 2026 at 20:16
@Nicolas-nwb
Contributor, Author

You're right — I traced through Module.update(parameters:verify:) and when biases is nil, build(value:) wraps Optional<MLXArray>.none as .value(.other(...)), which hits the (.value(.other(_)), .none) → break case in apply. So updateMissing is never actually called for a nil optional parameter. Removed the override and the round-trip test.
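The dispatch being traced here can be sketched schematically (simplified stand-ins for illustration, not the actual MLX Module internals):

```swift
// Schematic of the update dispatch described above (simplified stand-ins,
// not the real MLX Module types).
enum ModuleItem {
    case value(Any?)   // a parameter slot; nil models an absent optional like biases
}

enum UpdateError: Error { case keyNotFound(String) }

// One apply step: a nil optional slot paired with a missing incoming key
// breaks silently, while a populated slot with no incoming value is the
// case that surfaces as .keyNotFound.
func apply(key: String, slot: ModuleItem, incoming: Float?) throws {
    switch (slot, incoming) {
    case (.value(.none), .none):
        break                                // nil biases + missing key: OK
    case (.value(.some(_)), .none):
        throw UpdateError.keyNotFound(key)   // expected value is missing
    case (_, .some(_)):
        break                                // incoming value present: update it
    default:
        break
    }
}

// A nil biases slot with no incoming "biases" key passes without throwing.
try? apply(key: "biases", slot: .value(nil), incoming: nil)
```

This is why the override was redundant: the nil optional never reaches the missing-key path in the first place.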

The mxfp4 nil-biases test stays to guard the mode: forwarding fix.

Thanks for the thorough review and for maintaining this project!

Collaborator
@davidkoski left a comment

Changes look good, thank you for figuring this out!

@davidkoski davidkoski merged commit 3b11207 into ml-explore:main Apr 6, 2026
8 checks passed
2 participants