[WIP] fix qparams decompression #514

shanjiaz · 2025-11-10T21:22:09Z

Updated the decompress_weight function to unpack the zero_point and cast the scale dtype during decompression. The tensor in the module is then replaced with the updated one.
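To illustrate the unpacking step described above, here is a minimal sketch of packing and unpacking small unsigned zero-points into 32-bit words. The function names, the 4-bit width, and the lowest-bits-first layout are assumptions made for illustration; the actual layout used by the compressors in this PR may differ, and the scale dtype cast is not shown.

```python
from typing import List


def pack_zero_points(values: List[int], num_bits: int = 4) -> List[int]:
    """Pack unsigned integers into 32-bit words, lowest bits first (assumed layout)."""
    per_word = 32 // num_bits
    mask = (1 << num_bits) - 1
    packed = []
    for start in range(0, len(values), per_word):
        word = 0
        for offset, value in enumerate(values[start:start + per_word]):
            if not 0 <= value <= mask:
                raise ValueError(f"{value} does not fit in {num_bits} bits")
            word |= value << (offset * num_bits)
        packed.append(word)
    return packed


def unpack_zero_points(packed: List[int], count: int, num_bits: int = 4) -> List[int]:
    """Reverse pack_zero_points; `count` drops the padding in the last word."""
    per_word = 32 // num_bits
    mask = (1 << num_bits) - 1
    values = []
    for word in packed:
        for offset in range(per_word):
            values.append((word >> (offset * num_bits)) & mask)
    return values[:count]


zero_points = [0, 7, 15, 8, 3, 1, 2, 9, 4, 5]
packed = pack_zero_points(zero_points)        # ten 4-bit values fit in two words
restored = unpack_zero_points(packed, len(zero_points))
```

If decompression skips this step, the packed words would be read as if they were the zero-points themselves, which is one plausible way to end up with incoherent generations.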

The example script now generates a coherent result:

(llm-compressor) [shanjiaz@nma-a100-solo-4-preserve llm-compressor]$ python zp_decompression.py 
`torch_dtype` is deprecated! Use `dtype` instead!
Compressing model: 154it [00:00, 747.12it/s]



========== SAMPLE GENERATION ==============
<s> Hello my name is John and I am a software engineer. I have been working in the tech industry for the past 10 years. I have worked on various projects and have gained a lot of experience. I am passionate about technology and have a keen interest in the latest technologies. I have a bachelor's degree in computer science and have completed several certifications in various technologies. I am currently working as a software engineer at a leading technology company. In my free time, I enjoy
==========================================

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>

dsikka

Just an fyi: #509
This will also impact mxfp4.
I've turned off mxfp4 decompression in the meantime; it's lower priority anyway.

dsikka

Would it be cleaner to add optional compress_scale / decompress_scale and compress_zp / decompress_zp functions?

This would impact:

PackedCompressor (packed zp)
NVFP4PackedCompressor (fp8 scales)
MXFP4PackedCompressor (uint8 scales)
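One way to read this suggestion is as a set of optional hooks that default to the identity, so that each compressor overrides only the quantization parameters it stores in a special format. The sketch below is hypothetical: the class names, the dict-of-lists representation, and the two-values-per-byte packing are assumptions for illustration and are not the library's API.

```python
from typing import Any, Dict, List


class QParamHooksSketch:
    """Hypothetical base class; every hook is the identity by default."""

    def compress_scale(self, scale: List[float]) -> Any:
        return scale

    def decompress_scale(self, scale: Any) -> List[float]:
        return scale

    def compress_zp(self, zero_point: List[int]) -> Any:
        return zero_point

    def decompress_zp(self, zero_point: Any) -> List[int]:
        return zero_point

    def compress_qparams(self, qparams: Dict[str, Any]) -> Dict[str, Any]:
        return {
            "scale": self.compress_scale(qparams["scale"]),
            "zero_point": self.compress_zp(qparams["zero_point"]),
        }

    def decompress_qparams(self, qparams: Dict[str, Any]) -> Dict[str, Any]:
        return {
            "scale": self.decompress_scale(qparams["scale"]),
            "zero_point": self.decompress_zp(qparams["zero_point"]),
        }


class PackedZpSketch(QParamHooksSketch):
    """Overrides only the zero-point hooks: two 4-bit values per byte."""

    def compress_zp(self, zero_point: List[int]) -> Any:
        # Assumes every value is in [0, 15]; pad to an even length with a zero.
        padded = zero_point + [0] * (len(zero_point) % 2)
        data = bytes(lo | (hi << 4) for lo, hi in zip(padded[0::2], padded[1::2]))
        return {"count": len(zero_point), "data": data}

    def decompress_zp(self, zero_point: Any) -> List[int]:
        values: List[int] = []
        for byte in zero_point["data"]:
            values.extend((byte & 0x0F, byte >> 4))
        return values[: zero_point["count"]]


qparams = {"scale": [0.5, 0.25], "zero_point": [8, 7, 15]}
compressor = PackedZpSketch()
compressed = compressor.compress_qparams(qparams)
roundtrip = compressor.decompress_qparams(compressed)
```

With this shape, a compressor that stores scales in a narrower dtype would override only the scale hooks, and one with no special qparam handling would inherit the identity behaviour unchanged.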

shanjiaz · 2025-11-17T21:12:44Z

> Would it be cleaner to add optional: compress_scale / decompress_scale and compress_zp / decompress_zp functions?
>
> This would impact:
>
> PackedCompressor (packed zp)
> NVFP4PackedCompressor (fp8 scales)
> MXFP4PackedCompressor (uint8 scales)

Sure! I can do that.

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>

shanjiaz added 3 commits November 10, 2025 21:21

fix qparams decompression

98917b7

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>

quality

70e0838

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>

quality

e7c8bec

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>

shanjiaz added the "bug" label (Something isn't working) Nov 10, 2025

dsikka and others added 2 commits November 10, 2025 17:41

Merge branch 'main' into fix-qparams-decompression

b8ceadb

Merge branch 'main' into fix-qparams-decompression

71f34d7

dsikka mentioned this pull request Nov 17, 2025

[MXFP4] Add calibration support #509

Open

dsikka reviewed Nov 17, 2025

shanjiaz and others added 2 commits November 18, 2025 19:38

Add zero-point compression for asymmetric quantization

ac326ee

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>

Merge branch 'main' into fix-qparams-decompression

a86c657

3 participants
