Collaborator

@shanjiaz shanjiaz commented Nov 10, 2025

Updated the decompress_weight function to unpack the zero_point and cast the scale dtype during decompression. The tensor in the module is then replaced with the updated one.
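
To illustrate the unpack-and-replace step described above, here is a minimal pure-Python sketch. It is not the actual decompress_weight code. The helper names (pack_int4, unpack_int4, decompress_params), the 4-bit values packed lowest-nibble-first into 32-bit words, and the plain dict standing in for a module's parameters are all assumptions made for the example, and Python floats stand in for the target scale dtype.

```python
from typing import Dict, List

BITS = 4                      # assumed width of each zero-point value
PER_WORD = 32 // BITS         # eight 4-bit values fit in one 32-bit word
MASK = (1 << BITS) - 1


def pack_int4(values: List[int]) -> List[int]:
    """Pack unsigned 4-bit values into 32-bit words, lowest nibble first."""
    words = []
    for start in range(0, len(values), PER_WORD):
        word = 0
        for offset, value in enumerate(values[start:start + PER_WORD]):
            if not 0 <= value <= MASK:
                raise ValueError(f"{value} does not fit in {BITS} bits")
            word |= value << (offset * BITS)
        words.append(word)
    return words


def unpack_int4(words: List[int], count: int) -> List[int]:
    """Inverse of pack_int4; count is the number of original values."""
    values = []
    for word in words:
        for offset in range(PER_WORD):
            values.append((word >> (offset * BITS)) & MASK)
    return values[:count]  # drop the padding nibbles of the last word


def decompress_params(params: Dict[str, list], num_groups: int) -> None:
    """Unpack the zero point, cast the scale, and store both back in place."""
    params["zero_point"] = unpack_int4(params["zero_point"], num_groups)
    # Stand-in for a dtype cast: integers become Python floats.
    params["scale"] = [float(s) for s in params["scale"]]


zero_points = [8, 7, 0, 15, 3, 3, 9, 1, 12, 5]
params = {
    "zero_point": pack_int4(zero_points),
    "scale": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
}
decompress_params(params, num_groups=len(zero_points))
```

The key point of the sketch is the last step of decompress_params: the unpacked zero point and the cast scale overwrite the stored entries, so later code sees the decompressed values rather than the packed ones.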

The example script now generates a coherent result:

(llm-compressor) [shanjiaz@nma-a100-solo-4-preserve llm-compressor]$ python zp_decompression.py 
`torch_dtype` is deprecated! Use `dtype` instead!
Compressing model: 154it [00:00, 747.12it/s]



========== SAMPLE GENERATION ==============
<s> Hello my name is John and I am a software engineer. I have been working in the tech industry for the past 10 years. I have worked on various projects and have gained a lot of experience. I am passionate about technology and have a keen interest in the latest technologies. I have a bachelor's degree in computer science and have completed several certifications in various technologies. I am currently working as a software engineer at a leading technology company. In my free time, I enjoy
==========================================

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>
@shanjiaz added the bug (Something isn't working) label on Nov 10, 2025
Collaborator

@dsikka dsikka left a comment

Just an FYI, see #509: this will also impact mxfp4.
I've turned off mxfp4 decompression in the meantime; it's lower priority anyway.

Collaborator

@dsikka dsikka left a comment

Would it be cleaner to add optional compress_scale / decompress_scale and compress_zp / decompress_zp functions?

This would impact:

  • PackedCompressor (packed zp)
  • NVFP4PackedCompressor (fp8 scales)
  • MXFP4PackedCompressor (uint8 scales)
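
One way the optional hooks suggested above could look is sketched below. This is only an illustration under stated assumptions: BaseCompressor and PackedZpCompressor are hypothetical classes, not the real compressor classes listed above, plain Python lists stand in for tensors, and the two-nibbles-per-byte layout is chosen purely for the example.

```python
from typing import Dict, List


class BaseCompressor:
    """Hypothetical base class: the default hooks leave scale and zero point untouched."""

    def compress_scale(self, scale: list) -> list:
        return scale

    def decompress_scale(self, scale: list) -> list:
        return scale

    def compress_zp(self, zero_point: list) -> list:
        return zero_point

    def decompress_zp(self, zero_point: list) -> list:
        return zero_point

    def compress(self, state: Dict[str, list]) -> Dict[str, list]:
        out = dict(state)
        out["scale"] = self.compress_scale(state["scale"])
        out["zero_point"] = self.compress_zp(state["zero_point"])
        return out

    def decompress(self, state: Dict[str, list]) -> Dict[str, list]:
        out = dict(state)
        out["scale"] = self.decompress_scale(state["scale"])
        out["zero_point"] = self.decompress_zp(state["zero_point"])
        return out


class PackedZpCompressor(BaseCompressor):
    """Overrides only the zero-point hooks: two 4-bit values per byte."""

    def compress_zp(self, zero_point: List[int]) -> List[int]:
        if len(zero_point) % 2:
            raise ValueError("this sketch expects an even number of zero points")
        pairs = zip(zero_point[::2], zero_point[1::2])
        return [low | (high << 4) for low, high in pairs]

    def decompress_zp(self, zero_point: List[int]) -> List[int]:
        values = []
        for byte in zero_point:
            values += [byte & 0xF, byte >> 4]
        return values


compressor = PackedZpCompressor()
original = {"scale": [0.5, 0.25], "zero_point": [8, 7, 0, 15]}
packed = compressor.compress(original)
restored = compressor.decompress(packed)
```

With this shape, a compressor that only needs special scale handling (for example fp8 or uint8 scales) would override the scale hooks and inherit the identity zero-point hooks, while the shared compress and decompress paths stay the same.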

@shanjiaz
Collaborator Author

> Would it be cleaner to add optional: compress_scale / decompress_scale and compress_zp / decompress_zp functions?
>
> This would impact:
>
>   • PackedCompressor (packed zp)
>   • NVFP4PackedCompressor (fp8 scales)
>   • MXFP4PackedCompressor (uint8 scales)

Sure! I can do that.
