Comparative Analysis of High-Performance AI Compute Platforms: M5, DGX Spark, Strix Halo, and RTX 6000

This article summarizes a detailed, multi-day benchmark comparing the compute capabilities of modern Apple Silicon (M5), a specialized accelerator (DGX Spark), a high-end portable system (Strix Halo), and a dedicated professional GPU (RTX 6000). The findings highlight the critical role of memory bandwidth in LLM token throughput and note thermal constraints under sustained, high-load AI workloads.

Experimental Setup and Methodology

This analysis is based on a set of standardized tests run over a three-day period on four distinct hardware architectures in a controlled environment with optimized power and cooling: the M5 Mac, the DGX Spark, the Strix Halo, and the RTX 6000. The goal of the experiment was to quantify token generation throughput across diverse, high-demand AI tasks.
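As a rough illustration of what "token generation throughput" means in practice, the following is a minimal timing sketch, not the harness used in the tests. The `generate` callable and `fake_backend` are hypothetical stand-ins for whatever inference backend is being measured; the only assumption is that the backend yields one token at a time. Note that this simple version includes prompt processing time in the measurement.

```python
import time
from typing import Callable, Iterable


def measure_tps(generate: Callable[[str], Iterable[str]], prompt: str,
                clock: Callable[[], float] = time.perf_counter) -> float:
    """Return generated tokens per second for a single prompt.

    `generate` is a hypothetical stand-in for the backend under test and
    must yield one token at a time. The timing covers the whole call, so
    prompt processing is included in the elapsed time.
    """
    start = clock()
    count = 0
    for _ in generate(prompt):
        count += 1
    elapsed = clock() - start
    return count / elapsed if elapsed > 0 else float("nan")


def fake_backend(prompt: str):
    # Placeholder generator (one "token" per word) so the sketch runs
    # without a model; replace with a real streaming backend.
    for word in prompt.split():
        yield word


if __name__ == "__main__":
    tps = measure_tps(fake_backend, "the quick brown fox jumps")
    print(f"{tps:.1f} tokens/s (placeholder backend, not a real result)")
```

Running the same prompts through the same function on each machine keeps the comparison consistent; averaging several runs per prompt would reduce noise.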

Performance Correlation: Memory Bandwidth vs. Throughput

Initial observations indicated a strong correlation between memory bandwidth and achieved tokens per second (TPS). The RTX 6000 has a memory bandwidth of approximately 1,800 GB/s, about three times the M5's roughly 600 GB/s and about seven times that of the DGX Spark and Strix Halo (each approximately 256 GB/s). This hardware difference was reflected directly in the observed tokens-per-second curves across all tested platforms.
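One common way to reason about this correlation is a back-of-the-envelope ceiling: if decoding is memory-bound and every generated token requires reading all model weights once, tokens per second cannot exceed bandwidth divided by the weight footprint. The sketch below applies that rule of thumb to the approximate bandwidth figures quoted above. It assumes a dense model at batch size 1 and ignores KV-cache traffic and compute limits; the 20 GB model size is a hypothetical value chosen for illustration, not a figure from the benchmark.

```python
# Approximate memory bandwidth figures quoted in the text, in GB/s.
BANDWIDTH_GBPS = {
    "RTX 6000": 1800.0,
    "M5": 600.0,
    "DGX Spark": 256.0,
    "Strix Halo": 256.0,
}


def decode_tps_ceiling(bandwidth_gbps: float, model_gb: float) -> float:
    """Upper bound on tokens/s if each token reads all weights once."""
    return bandwidth_gbps / model_gb


def relative_bandwidth(baseline: str = "DGX Spark") -> dict[str, float]:
    """Bandwidth of every platform as a multiple of the baseline platform."""
    base = BANDWIDTH_GBPS[baseline]
    return {name: bw / base for name, bw in BANDWIDTH_GBPS.items()}


if __name__ == "__main__":
    MODEL_GB = 20.0  # hypothetical weight footprint, not from the benchmark
    for name, bw in BANDWIDTH_GBPS.items():
        ceiling = decode_tps_ceiling(bw, MODEL_GB)
        print(f"{name}: at most {ceiling:.1f} tokens/s for a {MODEL_GB:.0f} GB model")
    for name, ratio in relative_bandwidth().items():
        print(f"{name}: {ratio:.2f}x the DGX Spark's bandwidth")
```

Measured throughput will sit below these ceilings, but the ordering and rough ratios between platforms are what such an estimate is meant to convey.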

Platform Performance Deep Dive

M5 Performance and Ecosystem Value

On cost-efficiency and raw power, and setting ecosystem preferences aside, the maximum-configuration M5 unit substantially outperforms the DGX Spark. This advantage is attributed to the M5's memory architecture: it has over double the memory bandwidth of the DGX Spark (roughly 600 GB/s versus approximately 256 GB/s) while retaining a unified memory structure.

Thermal Management Under Sustained Load

A critical observation concerns the thermal behavior of the devices during extended, high-intensity AI workloads. While the EVO X2 (the Strix Halo system) presented thermal challenges, the M5 MacBook Pro performed surprisingly well, staying within the 80°C range over several days. Under extreme load, however, the M5, like many laptops running local AI workloads, produces significant heat and effectively functions as a high-power heat source. Claims of "quiet" operation during intense local AI tasks should be weighed against this high power consumption.
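A claim such as "within the 80°C range over several days" can be checked against a temperature log with a short summary like the one below. This is a sketch only: how the readings are collected is platform-specific and not covered here, the function name is hypothetical, and interpreting "the 80°C range" as 80 °C up to (but not including) 90 °C is an assumption.

```python
from statistics import mean


def summarize_temps(samples_c: list[float], low: float = 80.0,
                    high: float = 90.0) -> dict[str, float]:
    """Summarize logged temperatures in degrees Celsius.

    The band is [low, high); the 80-90 default is an assumed reading of
    "the 80 degree range", not a definition from the benchmark.
    """
    if not samples_c:
        raise ValueError("no temperature samples provided")
    in_band = sum(1 for t in samples_c if low <= t < high)
    return {
        "mean_c": mean(samples_c),
        "max_c": max(samples_c),
        "share_in_band": in_band / len(samples_c),
    }
```

Reporting the maximum alongside the share of samples in the band helps distinguish a machine that holds steady from one that briefly spikes well above it.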

Future Research and Data Availability

The current data provides a foundational comparison of peak performance metrics. Ongoing work aims to expand the dataset with additional software backends, such as MLX on the Mac, and different hosting environments on the Strix Halo, to determine how software stack optimization affects observed performance and output quality. Although this benchmark uses the RTX 6000, the findings are similar to those for other high-end GPUs, which may be relevant for readers weighing options such as an RTX 5090 PC build.

The raw data and methodology used in these tests are publicly available for further analysis and debate.

Original Source: M5 vs DGX Spark vs Strix Halo vs RTX 60