Qwen-27B-IQ4_KS for ik_llama.cpp, especially for NVIDIA with 16GB VRAM
Hi everyone, I'm presenting a new quantization of the Qwen-27B model, created specifically with 16GB VRAM NVIDIA GPUs in mind. I used quants that, unfortunately, are not yet available in the main ups