Qwen3.6-35B: Benchmarking a High-Capacity, Uncensored LLM with APEX and MTP Quantization
This article examines Qwen3.6-35B-A3B-Uncensored-Genesis, a large language model with a long context window, quantized builds (APEX, MTP), and uncensored deployment options. Testing indicates robust multi-tasking and long-context retention, which makes it a notable addition to the open-source LLM landscape.
Model Architecture and Core Features
The Qwen3.6-35B model is designed for flexibility and high performance, and it ships in multiple deployment formats. Key features include a fully uncensored configuration, which is particularly relevant for specialized or research applications, and support for modern quantization techniques.
Quantization and Compatibility
The model is offered in several formats to suit different hardware environments:
- GGUF Format (MTP/APEX): The primary format for local deployment, supporting both APEX and APEX Compact quantization methods. The MTP quantization variant is specifically highlighted for performance testing.
- Safetensors Format: Provided for standard deployment, including an FP8 version, ensuring compatibility with various AI frameworks.
- Framework Support: The Safetensors weights can be converted for Apple MLX, giving macOS users a native deployment option. An MTP-Safetensors release is in development.
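To make the trade-off between these formats concrete, here is a rough sketch of how weight size scales with bits per weight. It assumes roughly 35 billion total parameters (read from the "35B" in the model name, not a published count) and uses illustrative bit widths rather than the actual bits per weight of the APEX or MTP builds, which the source does not state. The function name `estimate_weight_gib` is hypothetical.

```python
def estimate_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB.

    Ignores the KV cache, activations, and per-file metadata,
    so real memory use will be higher.
    """
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: total parameter count taken from the "35B" in the model name.
TOTAL_PARAMS = 35e9

# Illustrative bit widths only; real quantized formats carry extra
# scale/offset data and rarely land on a round number.
for label, bits in [("16-bit", 16.0), ("8-bit (e.g. FP8)", 8.0), ("4-bit", 4.0)]:
    size = estimate_weight_gib(TOTAL_PARAMS, bits)
    print(f"{label}: ~{size:.1f} GiB for weights")
```

An estimate like this is only a first filter for whether a given format fits on a machine. Long-context runs add a KV cache on top, and its size depends on architecture details not covered here.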
Performance and Context Window Resilience
Testing was conducted on dedicated hardware (Beelink GTR9 Pro + Strix Halo) using the Q8_K_P - MTP quant variant. The results point to strong resilience in long-context inference, a key metric for applications that rely on very large prompts.
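One common failure mode in long-context inference is degenerate repetition, where the model falls into a loop. The sketch below shows one simple way to flag it in generated text by counting repeated word n-grams. It is not the procedure used in these tests, which the source does not describe. The function names and the default values for `n` and `threshold` are hypothetical and would need tuning.

```python
from collections import Counter


def max_ngram_repeats(text: str, n: int = 8) -> int:
    """Return how often the most frequent n-word sequence occurs in text."""
    words = text.split()
    if len(words) < n:
        return 0
    # Slide a window of n words over the text and count each sequence.
    counts = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return max(counts.values())


def looks_like_loop(text: str, n: int = 8, threshold: int = 3) -> bool:
    """Flag output whose most common n-gram repeats at least `threshold` times."""
    return max_ngram_repeats(text, n) >= threshold


if __name__ == "__main__":
    looping = "the cat sat on the mat " * 5
    normal = "each word in this short sentence appears exactly once here"
    print(looks_like_loop(looping, n=4))
    print(looks_like_loop(normal, n=4))
```

A check like this catches only verbatim loops. Paraphrased repetition or factual errors would still need manual review or a separate evaluation.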
Long-Context Stability
Across five test sessions with a 200k context window, the model remained stable. The tests reported zero instances of glitches, repetitive loops, or erroneous