Local AI
INT4 Quantization
A quantization level that represents model weights as 4-bit integers, cutting memory use to roughly a quarter of the FP16 footprint and dramatically reducing VRAM and RAM requirements. INT4 allows large models such as Llama 3 70B to run on a single high-memory GPU, or on consumer cards with partial CPU offloading, with modest quality trade-offs.
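The core idea can be sketched in a few lines. This is a minimal illustration of symmetric per-tensor 4-bit quantization (mapping floats to integers in [-8, 7]), not the calibrated schemes real runtimes use (GPTQ, AWQ, and llama.cpp's K-quants group weights and tune scales more carefully); the function names are made up for the example.

```python
import numpy as np

def quantize_int4(w):
    # Symmetric per-tensor quantization: scale so the largest
    # magnitude maps to the top of the 4-bit signed range [-8, 7].
    scale = float(np.max(np.abs(w))) / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q, scale):
    # Recover approximate float weights from the 4-bit codes.
    return q.astype(np.float32) * scale

weights = np.array([0.42, -1.3, 0.07, 2.1], dtype=np.float32)
q, scale = quantize_int4(weights)
recovered = dequantize_int4(q, scale)
print(q)          # 4-bit codes, stored here in int8 containers
print(recovered)  # approximate reconstruction of the originals
```

Each weight now needs only 4 bits plus a shared scale, which is where the ~4x memory saving over FP16 comes from; the rounding error (at most half a scale step per weight) is the source of the quality trade-off. Production formats additionally pack two 4-bit codes per byte and use per-group scales to keep that error small.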