W4A4 weight–activation quantization and INT8 KV-cache quantization for Infinity visual autoregressive (VAR) models. Optimized for high-fidelity generative-AI deployment on edge GPUs (e.g., NVIDIA Jetson).
Topics: computer-vision, pytorch, gpu-acceleration, quantization, model-compression, nvidia-jetson, inference-optimization, edge-ai, on-device-ml, weight-quantization, post-training-quantization, autoregressive-models, generative-ai, kv-cache-quantization, activation-quantization, visual-autoregressive-model, svdquant, infinity-var
Updated Apr 28, 2026 · Python
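To make the INT8 KV-cache idea above concrete, here is a minimal sketch of per-tensor symmetric INT8 quantization applied to a KV-cache tensor. The function names, NumPy implementation, and per-tensor symmetric scheme are illustrative assumptions for this sketch, not this repository's actual API or quantization recipe:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Per-tensor symmetric INT8 quantization (illustrative sketch).

    Maps the largest-magnitude value in x to +/-127 and rounds the rest.
    Returns the int8 tensor and the float scale needed to dequantize.
    """
    scale = float(np.abs(x).max()) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from the int8 codes."""
    return q.astype(np.float32) * scale

# Toy KV-cache slice: (heads, sequence positions, head dim) -- shapes are arbitrary here.
rng = np.random.default_rng(0)
kv = rng.standard_normal((2, 4, 8)).astype(np.float32)

q, s = quantize_int8(kv)
err = float(np.abs(dequantize_int8(q, s) - kv).max())
```

Storing the cache as INT8 plus one float scale cuts KV-cache memory roughly 4x versus FP32 (2x versus FP16); the worst-case rounding error of this scheme is bounded by half the scale.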