NVIDIA Continues To Sweep AI Training Benchmarks With Hopper & Ampere GPUs
Press Release: In industry-standard tests of AI training, NVIDIA H100 Tensor Core GPUs set world records on enterprise workloads, while A100 GPUs raised the bar in high-performance computing.

Two months after their debut sweeping MLPerf inference benchmarks, NVIDIA H100 Tensor Core GPUs set world records across enterprise AI workloads in the industry group's latest tests of AI training. Together, the results show H100 is the best choice for users who demand the utmost performance when creating and deploying advanced AI models.

NVIDIA H100 GPUs (aka Hopper) set world records for training models in all eight MLPerf enterprise workloads. They delivered up to 6.7x more performance than previous-generation GPUs posted when they were first submitted on MLPerf training. By the same comparison, today's A100 GPUs pack 2.5x more muscle, thanks to advances in software.

A100 GPUs Hit New Peak in HPC

In the separate suite of MLPerf HPC benchmarks, A100 GPUs swept all tests of training AI models for demanding scientific workloads run on supercomputers. The results show the NVIDIA AI platform's ability to scale to the world's toughest technical challenges.

For example, A100 GPUs trained AI models in the CosmoFlow test for astrophysics 9x faster than the best results two years ago in the first round of MLPerf HPC. In that same workload, the A100 also delivered up to a whopping 66x more throughput per chip than an alternative offering.

The HPC benchmarks train models for work in astrophysics, weather forecasting, and molecular dynamics. They are among many technical fields, like drug discovery, that are adopting AI to advance science.
Supercomputer centers in Asia, Europe, and the U.S. participated in the latest round of the MLPerf HPC tests. In its debut on the DeepCAM benchmarks, Dell Technologies showed strong results using NVIDIA A100 GPUs.

An Unparalleled Ecosystem

In the enterprise AI training benchmarks, a total of 11 partners, including the Microsoft Azure cloud service, made submissions using NVIDIA A100, A30, and A40 GPUs. System makers including ASUS, Dell Technologies, Fujitsu, GIGABYTE, Hewlett Packard Enterprise, Lenovo, and Supermicro used a total of nine NVIDIA-Certified Systems for their submissions.

Submissions in the latest HPC tests, for example, applied a suite of software optimizations and techniques described in a technical article. Together they slashed runtime on one benchmark by nearly 5x, to just 22 minutes from 101 minutes.

In the latest round, at least three partners joined NVIDIA in submitting results on all eight MLPerf training workloads. That versatility is important because real-world applications often require a suite of diverse AI models.