
fix: correct KV cache memory stats for K/V metadata and fp16 baseline #53

Open
dipeshbabu wants to merge 1 commit into TheTom:main from dipeshbabu:fix/kv-cache-memory-stats-accounting

Conversation

@dipeshbabu

Fix KVCacheCompressor.memory_stats() so it matches the actual storage layout used by compress().

The previous implementation undercounted memory in two ways:

  • it treated the original fp16 baseline as a single tensor instead of combined K+V
  • it omitted stored norms from the compressed accounting, including the V-side norm and one of the K-side norms

What changed

  • count original KV cache size as fp16 K + fp16 V
  • count K compressed storage as:
    • d * k_bits
    • vector_norm
    • residual_norm
  • count V compressed storage as:
    • d * v_bits
    • vector_norm
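The accounting above can be sketched as follows. This is an illustrative reconstruction, not the repository's actual `memory_stats()` code: the function name, the assumption that quantized codes are bit-packed, and the assumption that each stored norm is an fp16 scalar (2 bytes) are all mine, not confirmed by the PR.

```python
FP16_BYTES = 2  # assumed size of the fp16 baseline elements and stored norms

def kv_cache_memory_stats(head_dim, k_bits, v_bits,
                          seq_len=1, num_layers=1, num_heads=1):
    """Return (original_bytes, compressed_bytes) per the PR's accounting.

    Hypothetical helper for illustration; not the repository's API.
    """
    tokens = seq_len * num_layers * num_heads

    # Original baseline: fp16 K + fp16 V (the old code counted only one tensor).
    original = tokens * 2 * head_dim * FP16_BYTES

    # K compressed storage: d * k_bits of packed codes + vector_norm + residual_norm.
    k_bytes = head_dim * k_bits / 8 + 2 * FP16_BYTES
    # V compressed storage: d * v_bits of packed codes + vector_norm only.
    v_bytes = head_dim * v_bits / 8 + 1 * FP16_BYTES

    compressed = tokens * (k_bytes + v_bytes)
    return original, compressed
```

Under these assumptions, the regression-test configuration (head_dim=128, k_bits=3, v_bits=3, everything else 1) gives an original size of 2 * 128 * 2 = 512 bytes against 52 + 50 = 102 compressed bytes.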

Tests

Added an exact regression test in tests/test_kv_cache.py that checks the byte math for:

  • head_dim=128
  • k_bits=3
  • v_bits=3
  • seq_len=1
  • num_layers=1
  • num_heads=1

Also verified the full KV cache test file passes.

@TheTom
Owner

TheTom commented Apr 2, 2026

Hey there. Thank you for the contribution. I'll be getting to them. I apologize for the delay.

