# weight-quantization

Here are 2 public repositories matching this topic...

anilknayak / quantization

Weight quantization

python machine-learning deep-neural-networks deep-learning quantization quantization-algorithms weight-quantization quantization-algorithm quantization-by-clustering quantization-weight-clustering

Updated Aug 31, 2020
Python

Henvezz95 / VAR-Compressor

W4A4 and INT8 KV-cache quantization for Infinity VAR models. Optimized for high-fidelity generative AI deployment on edge GPUs (e.g. NVIDIA Jetson).

computer-vision pytorch gpu-acceleration quantization model-compression nvidia-jetson inference-optimization edge-ai on-device-ml weight-quantization post-training-quantization autoregressive-models generative-ai kv-cache-quantization activation-quantization visual-autoregressive-model svdquant infinity-var

Updated Apr 29, 2026
Python
