anilknayak / quantization
Stars: 2
Description: Weight quantization
Topics: python, machine-learning, deep-neural-networks, deep-learning, quantization, quantization-algorithms, weight-quantization, quantization-algorithm, quantization-by-clustering, quantization-weight-clustering
Updated: Aug 31, 2020
Language: Python

Henvezz95 / VAR-Compressor
Stars: 0
Description: W4A4 and INT8 KV-cache quantization for Infinity VAR models. Optimized for high-fidelity generative AI deployment on edge GPUs (e.g. NVIDIA Jetson).
Topics: computer-vision, pytorch, gpu-acceleration, quantization, model-compression, nvidia-jetson, inference-optimization, edge-ai, on-device-ml, weight-quantization, post-training-quantization, autoregressive-models, generative-ai, kv-cache-quantization, activation-quantization, visual-autoregressive-model, svdquant, infinity-var
Updated: Apr 29, 2026
Language: Python