
AMD MI300X outperforms NVIDIA H100 in LLM inference benchmarks.

AMD MI300X Up To 3x Faster Than NVIDIA H100 In LLM Inference AI Benchmarks, Offers Competitive Pricing Too

Tensorwave has released benchmarks comparing the AMD MI300X and NVIDIA H100 in LLM inference workloads, showing up to a 3x performance advantage for the AMD accelerator. The tests used the Mixtral 8x7B model, with the AMD setup running ROCm 6.1.2 drivers and the NVIDIA setup running CUDA 12.2 drivers.

In offline performance tests, the MI300X outperformed the H100 by up to 194%. In online tests simulating chat applications, the MI300X handled 33% more requests per second while maintaining a 5-second average latency. Even with NVIDIA's recent software optimizations, the MI300X retained its performance and value advantage.

Tensorwave praised the MI300X for its high performance, competitive cost, and availability, recommending it to enterprises looking to scale their AI inference capabilities. The company's CEO called the MI300X a superior option to the H100, noting that it can be obtained today while H100 supply remains booked out. Overall, the benchmarks show the MI300X leading in both offline and online inference for MoE architectures like Mixtral 8x7B, making it a compelling choice for AI workloads.
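The online metrics above (requests per second and average latency) are the standard way chat-style inference load tests are scored. As a rough illustration of how such numbers are computed, here is a minimal Python sketch; `fake_inference` is a hypothetical stand-in for a real endpoint call and is not part of Tensorwave's methodology.

```python
import statistics
import time

def fake_inference(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM endpoint call (e.g. an HTTP
    # request to an inference server); sleeps briefly to mimic generation.
    time.sleep(0.001)
    return prompt[::-1]

def run_load_test(num_requests: int = 100) -> dict:
    # Time each request individually (latency) and the whole run (throughput).
    latencies = []
    start = time.perf_counter()
    for i in range(num_requests):
        t0 = time.perf_counter()
        fake_inference(f"request {i}")
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "requests_per_second": num_requests / elapsed,
        "avg_latency_s": statistics.mean(latencies),
    }

if __name__ == "__main__":
    print(run_load_test())
```

A real harness would issue requests concurrently and report tail latencies as well as the mean, but the two headline figures reduce to the same arithmetic shown here.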


Source link: https://wccftech.com/amd-mi300x-3x-faster-nvidia-h100-llm-inference-ai-benchmarks-competitive-pricing/amp/

