The official results for the MLPerf Inference v2.0 benchmark tests are in! In this round of testing, five different Dell PowerEdge servers ranked number one in their respective categories. That feat is particularly noteworthy because the Dell servers had to beat out far more competitors than in past rounds. Compared to v1.1, teams submitted twice as many results in the performance categories and six times as many in the power categories. Dell had 187 results in the closed division, competing in a field of 2,156 submissions.
The MLPerf Inference Benchmark
From the beginning, the MLPerf benchmarks focused on replicating real-world use cases like image recognition, object detection, speech-to-text, natural language processing and recommendation engines.
“We knew that to drive progress in machine learning, we needed benchmarks that pushed on the frontier between research and industrial practice and that creating large-scale open data sets would be critical to shifting this line over time,” explains MLCommons. “To democratize these newfound technological capabilities and ensure wide adoption, we needed to reduce friction and improve ML portability so that we could share best practices across the boundaries between countries, between academia and industry, and between researchers and engineers in companies.”
Today, MLCommons continues to oversee the MLPerf benchmarks and validate submitted results. While some other benchmark programs accept results on a rolling basis, MLCommons invites participants to submit results in discrete rounds of testing and then publishes accepted results as a set. All results must adhere to the organization's standards for testing, and they must pass compliance tests and peer review before being accepted by MLCommons.
Because systems must meet such high standards to be accepted by MLCommons, the industry has generally recognized the MLPerf tests as one of the fairest ways to compare the performance of ML systems. Customers rely on these results to gauge which systems offer the fastest performance and which offer the most energy-efficient performance for ML use cases.
Dell’s Performance on MLPerf v2.0
In this round of testing, Dell PowerEdge servers stood out against competing systems in five categories:
#1 in Performance Per Accelerator with NVIDIA A100 GPUs1
When compared to other systems featuring NVIDIA A100 GPUs, the Dell PowerEdge XE8545 and PowerEdge R750xa outshone all the competition. These NVIDIA GPUs are particularly popular with systems designed for deep learning workloads, making this a coveted claim. The test results also span a wide range of use cases, including image classification, object detection, speech-to-text, medical imaging, natural language processing and recommendation engines.
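For context on how a "per accelerator" ranking is computed: per the footnotes at the end of this post, the per-accelerator figure is a submission's primary performance metric divided by the number of accelerators reported. Below is a minimal Python sketch of that normalization; the throughput values are hypothetical illustrations, not actual MLPerf results.

```python
def per_accelerator(total_throughput: float, num_accelerators: int) -> float:
    """Normalize a system-level throughput (e.g., samples/s) to a per-GPU figure."""
    return total_throughput / num_accelerators

# Hypothetical values for illustration only -- not actual MLPerf results.
print(f"4-GPU system: {per_accelerator(120_000, 4):,.0f} samples/s per GPU")
print(f"8-GPU system: {per_accelerator(230_000, 8):,.0f} samples/s per GPU")
```

Normalizing this way lets systems with different accelerator counts be compared on how much work each GPU delivers, rather than on total system size alone.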
The PowerEdge XE8545 is a 2-socket, 4U server with high-core-count AMD EPYC™ CPUs, NVIDIA A100 GPUs mounted directly to the motherboard (SXM4), and NVIDIA NVLink for high-speed communication between the GPUs. The PowerEdge R750xa is a 2U server with two Intel® Xeon® Scalable CPUs, designed to boost performance for GPU-intensive workloads with PCIe GPUs and an NVIDIA NVLink Bridge to speed communication between GPUs.
#1 in System Performance for Multiple Benchmarks Among All PCIe-Based 4-GPU Servers2
The Dell PowerEdge R750xa also bested the competition among PCIe-based 4-GPU servers. These benchmark results apply to image classification, object detection, speech-to-text, natural language processing and recommendation engines.
#1 for Lowest Multi-Stream Latency with a MIG Instance in Edge3
This round of testing included results from multi-instance GPU (MIG) configurations, and the Dell PowerEdge XE8545 topped this category in the edge computing division. These test results apply to image classification and object detection use cases.
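As a rough illustration of what a multi-stream latency metric measures: in MLPerf's multi-stream scenario, each query bundles multiple samples, and the reported figure is a high-percentile latency across the measured queries (lower is better). Here is a minimal sketch of a nearest-rank percentile calculation, not the official MLPerf LoadGen implementation; the latency values are hypothetical.

```python
import math

def percentile_latency(latencies_ms: list[float], pct: float = 99.0) -> float:
    """Nearest-rank percentile: the latency below which pct% of queries fall."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(pct / 100 * len(ordered))  # nearest-rank method
    return ordered[rank - 1]

# Hypothetical per-query latencies in milliseconds, for illustration only.
measured = [11.2, 10.8, 12.1, 11.5, 10.9, 13.4, 11.0, 11.3]
print(f"p99 latency: {percentile_latency(measured):.1f} ms")
```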
#1 for Highest T4 Inference Results4
The Dell PowerEdge XE2420 is a 2-socket, 2U edge server with a short-depth form factor. When compared to other systems using NVIDIA T4 GPUs, this server performed the best overall. These results relate to image classification, speech-to-text, and recommendation engine use cases. They also help answer questions about NVIDIA T4 vs. A2, T4 vs. A30, and T4 vs. A100 MIG performance and power metrics.
#1 for Highest Performance Per Watt with NVIDIA A2 GPU Results5
Finally, the Dell PowerEdge XR12 is a ruggedized, marine-compliant 2U server with a small form factor. It's suitable for telecommunications, military, retail, and restaurant deployments, as well as for other challenging environments where high power efficiency is desirable. These results make it a good choice for image classification, object detection, speech-to-text, natural language processing, and recommendation engine workloads.
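To clarify what a performance-per-watt ranking captures: the metric relates a system's measured throughput to the average power it draws during the benchmark run, so a modest accelerator like the A2 can lead when efficiency, not raw speed, is the goal. A minimal sketch under that assumption follows; it is not MLCommons power-measurement code, and the readings are hypothetical.

```python
def perf_per_watt(throughput: float, power_samples_w: list[float]) -> float:
    """Throughput (e.g., samples/s) divided by average measured power in watts."""
    avg_power_w = sum(power_samples_w) / len(power_samples_w)
    return throughput / avg_power_w

# Hypothetical readings for illustration only -- not actual MLPerf results.
print(f"{perf_per_watt(4_500, [61.2, 59.8, 60.5, 60.1]):.1f} samples/s per watt")
```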
More information about Dell’s performance on the latest round of MLPerf benchmark tests is available on the Dell InfoHub.
1 CLM-004860 Based on testing submissions using the Inference Datacenter v2.0 benchmark test on April 6, 2022 at the Dell HPC & AI Innovation Lab. Per-accelerator performance is calculated by dividing the primary metric of total performance by the number of accelerators reported. MLPerf Inference v2.0 Datacenter, Closed division. Submission IDs: 2.0-015, 2.0-018, 2.0-019, 2.0-020. The MLPerf name and logo are trademarks. See www.mlcommons.org for more information.
2 CLM-004858 Based on testing submissions using the Inference Datacenter v2.0 benchmark test on April 6, 2022 at the Dell HPC & AI Innovation Lab. Per-accelerator performance is calculated by dividing the primary metric of total performance by the number of accelerators reported. Actual performance may vary depending on product configuration. MLPerf Inference v2.0 Datacenter, Closed division. Submission IDs: 2.0-015, 2.0-018, 2.0-019, 2.0-020. The MLPerf name and logo are trademarks. See www.mlcommons.org for more information.
3 CLM-004854 Based on testing submissions using the Inference Edge v2.0 benchmark test on April 6, 2022 at the Dell HPC & AI Innovation Lab. Per-accelerator performance is calculated by dividing the primary metric of total performance by the number of accelerators reported. Actual performance may vary depending on product configuration. MLPerf Inference v2.0 Edge, Closed division. Submission ID: 2.0-026. The MLPerf name and logo are trademarks. See www.mlcommons.org for more information.
4 CLM-004865 Based on testing submissions using the Inference Datacenter v2.0 benchmark test on April 6, 2022 at the Dell HPC & AI Innovation Lab. Per-accelerator performance is calculated by dividing the primary metric of total performance by the number of accelerators reported. Actual performance may vary. MLPerf Inference v2.0 Datacenter, Closed division. Submission IDs: 2.0-016, 2.0-017. The MLPerf name and logo are trademarks. See www.mlcommons.org for more information.
5 CLM-004872 Based on testing submissions using the Inference Datacenter v2.0 benchmark test on April 6, 2022 at the Dell HPC & AI Innovation Lab. Per-accelerator performance is calculated by dividing the primary metric of total performance by the number of accelerators reported. Actual performance may vary depending on product configuration. MLPerf Inference v2.0 Datacenter, Closed division. Submission IDs: 2.0-021, 2.0-022. The MLPerf name and logo are trademarks. See www.mlcommons.org for more information.