Qwen 3.6 27B Speculative Decoding Bench: Pushing ~100 TPS on a single RTX 3090

u/old-mike 2026-06-30 · 12:40 UTC

Qwen 3.6 27B with speculative decoding reached roughly 100 tokens per second (TPS) on a single RTX 3090. The benchmark compared multiple inference engines on that one card. For details, see the original source.
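For anyone wanting to sanity-check a TPS figure like this on their own card, here is a minimal sketch of how tokens per second is usually computed: count the generated tokens and divide by wall-clock time. The name `measure_tps` and its `generate` and `clock` parameters are hypothetical, made up for illustration. The summary doesn't say how the benchmark timed its runs, so this is an assumption about method, not a description of it.

```python
import time
from typing import Callable, Iterable


def measure_tps(
    generate: Callable[[], Iterable[object]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return generated tokens per second for a single run.

    `generate` is a hypothetical callable that yields one item per
    generated token (e.g. a wrapper around an engine's streaming API).
    `clock` is injectable so the function can be tested deterministically.
    """
    start = clock()
    # Consume the stream and count tokens as they arrive.
    n_tokens = sum(1 for _ in generate())
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed
```

Note that timing the whole stream like this folds prompt processing into the result. Published decode-speed numbers often exclude it, so compare like with like when lining up results across engines.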