Collecting Strix Halo / Ryzen AI MAX+ 395 local LLM results: llama.cpp Vulkan/RADV, Ollama, ROCm/HIP
I’m collecting reproducible local LLM results for AMD Strix Halo / Ryzen AI MAX+ 395 systems.

My current fastest direct llama.cpp row:
- Hardware: Beelink GTR9 Pro
- APU: Ryzen AI MAX+ 395 / Radeon 8060S
- Memory: 128GB LPDDR5X unified memory
- OS: Ubuntu 24.04
- Backend: llama.cpp Vulkan/RADV
- Model: Qwen3-Coder 30B-A3B
- Quant: Q4_K_S
- Result: 98.51 t/s tg128, direct llama-bench

Other rows I’m tracking:
- Qwen3-Coder 30B-A3B UD-Q4_K_XL: 96.76 t/s tg128
- Qwen3.6 35
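
For anyone who wants to submit rows in a consistent shape, here is a minimal Python sketch of how one result could be recorded. The `BenchRow` class, its field names, and the `relative_gap` helper are hypothetical and are not part of llama.cpp or of any tooling mentioned in the post. The values are copied from the two Qwen3-Coder rows listed above.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class BenchRow:
    """One benchmark result row. Field names are illustrative assumptions."""
    hardware: str
    apu: str
    memory: str
    os: str
    backend: str
    model: str
    quant: str
    test: str            # llama-bench test label, e.g. "tg128"
    tokens_per_s: float
    method: str


def relative_gap(fast: float, slow: float) -> float:
    """Return how much slower `slow` is than `fast`, as a fraction of `fast`."""
    return (fast - slow) / fast


# Values copied from the fastest row reported in the post.
fastest = BenchRow(
    hardware="Beelink GTR9 Pro",
    apu="Ryzen AI MAX+ 395 / Radeon 8060S",
    memory="128GB LPDDR5X unified memory",
    os="Ubuntu 24.04",
    backend="llama.cpp Vulkan/RADV",
    model="Qwen3-Coder 30B-A3B",
    quant="Q4_K_S",
    test="tg128",
    tokens_per_s=98.51,
    method="direct llama-bench",
)

# One JSON object per line keeps rows easy to append and diff.
print(json.dumps(asdict(fastest)))

# Gap between the Q4_K_S row and the UD-Q4_K_XL row (96.76 t/s tg128).
print(f"{relative_gap(98.51, 96.76):.2%}")
```

Keeping backend, quant, and test label as separate fields makes it harder to accidentally compare rows that were measured differently, for example a direct llama-bench run against an Ollama run.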