fix: V-norm in memory_stats, SeedSequence PRNG, streaming API, serialization #61

Open
brosequist wants to merge 1 commit into TheTom:main from brosequist:fix/memory-stats-seedseq-streaming

Conversation

@brosequist

Summary

  • KVCacheCompressor.memory_stats() omitted the 32-bit float norm stored per V vector, inflating the reported compression ratio. Adds v_bits_total += n_vectors * 32.
  • Adds compressed_size_bits() to TurboQuantMSE (was missing; TurboQuant already had it).
  • Replaces seed + 1000 offset with np.random.SeedSequence(seed).spawn(2) for true PRNG independence between the PolarQuant and QJL stages.
  • Adds compress_token() / get_compressed_cache() streaming API to KVCacheCompressor for auto-regressive token-by-token inference.
  • Adds CompressedVector.to_bytes() / from_bytes() for disk/network serialization.
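The seeding change in the third bullet can be illustrated with a minimal sketch. Only `np.random.SeedSequence`, `spawn`, and `np.random.default_rng` are real NumPy API here; the helper `make_stage_rngs` and the variable names are hypothetical, and whether the repository uses `default_rng` or another generator is an assumption.

```python
import numpy as np


def make_stage_rngs(seed: int):
    """Derive two independent generators from one user seed (hypothetical helper)."""
    # spawn(2) yields two child SeedSequences with distinct spawn keys.
    polar_ss, qjl_ss = np.random.SeedSequence(seed).spawn(2)
    return np.random.default_rng(polar_ss), np.random.default_rng(qjl_ss)


# With a fixed offset, the second stage of seed=0 reuses the exact stream
# that the first stage of seed=1000 would use.
offset_qjl = np.random.default_rng(0 + 1000).standard_normal(4)
other_polar = np.random.default_rng(1000).standard_normal(4)
print(np.array_equal(offset_qjl, other_polar))  # True: identical streams

# Spawned children do not share a stream, whatever the user seed is.
polar_rng, qjl_rng = make_stage_rngs(0)
print(np.array_equal(polar_rng.standard_normal(4), qjl_rng.standard_normal(4)))
```

The offset scheme makes the stages collide whenever two user seeds differ by exactly 1000, which is the case spawning avoids.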

Test plan

  • pytest tests/test_kv_cache.py covers all new paths including memory_stats accuracy, streaming API, and serialization round-trip.
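For readers unfamiliar with what a serialization round-trip check looks like, here is a rough sketch. The `CompressedVector` fields (`norm`, `bit_width`, `indices`) and the byte layout are assumptions made for illustration; the PR's actual class and format may differ, and this version stores one byte per index rather than bit-packing.

```python
import struct
from dataclasses import dataclass

# Hypothetical header: float32 norm, uint8 bit width, uint32 index count.
_HEADER = struct.Struct("<fBI")


@dataclass(frozen=True)
class CompressedVector:
    norm: float        # stored as float32 on the wire
    bit_width: int     # bits per quantized coordinate
    indices: tuple     # quantization indices, each assumed < 256

    def to_bytes(self) -> bytes:
        header = _HEADER.pack(self.norm, self.bit_width, len(self.indices))
        return header + bytes(self.indices)  # one byte per index (unpacked)

    @classmethod
    def from_bytes(cls, data: bytes) -> "CompressedVector":
        norm, bit_width, count = _HEADER.unpack_from(data, 0)
        payload = data[_HEADER.size:_HEADER.size + count]
        if len(payload) != count:
            raise ValueError("truncated CompressedVector payload")
        return cls(norm=norm, bit_width=bit_width, indices=tuple(payload))


# Round trip: 1.5 is exactly representable in float32, so equality holds.
original = CompressedVector(norm=1.5, bit_width=3, indices=(0, 7, 3, 5))
restored = CompressedVector.from_bytes(original.to_bytes())
print(restored == original)  # True
```

A norm that is not exactly representable in float32 would come back rounded, so a real test would likely compare norms with a tolerance.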

🤖 Generated with Claude Code

…essed_size_bits

KVCacheCompressor.memory_stats() omitted the float32 norm stored per V vector,
inflating the reported compression ratio. Add v_bits_total += n_vectors * 32 to
account for it. Also adds compressed_size_bits() to TurboQuantMSE (was missing;
TurboQuant already had it), fixing the asymmetry between the two classes.
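To see how much the missing norm matters, consider a back-of-the-envelope sketch. The function names, the fp16 baseline, and the example sizes (head dimension 128, 3 bits per coordinate) are illustrative assumptions, not values taken from the repository.

```python
def compressed_bits(n_vectors: int, head_dim: int, bits_per_coord: int,
                    include_norm: bool = True) -> int:
    """Bits needed for n_vectors quantized vectors (hypothetical layout)."""
    bits = n_vectors * head_dim * bits_per_coord
    if include_norm:
        bits += n_vectors * 32  # one float32 norm per vector
    return bits


def compression_ratio(n_vectors: int, head_dim: int, bits_per_coord: int,
                      include_norm: bool = True) -> float:
    """Ratio of an assumed fp16 baseline to the compressed size."""
    original = n_vectors * head_dim * 16
    return original / compressed_bits(n_vectors, head_dim, bits_per_coord,
                                      include_norm)


without_norm = compression_ratio(1024, 128, 3, include_norm=False)
with_norm = compression_ratio(1024, 128, 3, include_norm=True)
print(f"{without_norm:.3f} vs {with_norm:.3f}")  # 5.333 vs 4.923
```

Under these assumptions, leaving out the norm overstates the ratio by roughly 8%, and the gap widens as the head dimension or bit width shrinks.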

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@TheTom
Owner

TheTom commented Apr 2, 2026

hey there. thank you for the contribution. i'll be getting to them. I apologize for the delay.


2 participants