This article was written by Martin Feyereisen and Joshua Weage. HPC and AI Innovation Lab, October 2019
This blog discusses the performance of OpenFOAM®, a popular Computational Fluid Dynamics (CFD) application, on the Dell EMC Ready Solutions for HPC Digital Manufacturing with AMD EPYC™ 7002 series processors. This Ready Solution was designed and configured specifically for Digital Manufacturing workloads, where Computer Aided Engineering (CAE) applications are critical for virtual product development. It uses a flexible building block approach to HPC system design, where individual building blocks can be combined to build HPC systems that are optimized for customer-specific workloads and use cases.
The Dell EMC Ready Solutions for HPC Digital Manufacturing is one of many solutions in the Dell EMC HPC solution portfolio. Please visit www.dellemc.com/hpc for a comprehensive overview of the HPC solutions offered by Dell EMC.
Benchmarks were run on both 7001 and 7002 series AMD EPYC processors. The system configurations used are shown in Table 1 and Table 2. All servers were equipped with two processors. The BIOS configuration used for the benchmarking systems is shown in Table 3.
Table 1: 7001 Series AMD EPYC System Configuration | |
---|---|
Server | Dell EMC PowerEdge R7425 |
Processors | AMD EPYC 7451 24-core Processor (x2); AMD EPYC 7601 32-core Processor (x2) |
Memory | 16x16GB 2400 MTps RDIMMs |
BIOS Version | 1.10.6 |
Operating System | Red Hat Enterprise Linux Server release 7.5 |
Kernel Version | 3.10.0-862.el7.x86_64 |
Table 2: 7002 Series AMD EPYC System Configuration | |
---|---|
Server | Dell EMC PowerEdge C6525 |
Processors | AMD EPYC 7702 64-Core Processor (x2); AMD EPYC 7502 32-Core Processor (x2); AMD EPYC 7402 24-Core Processor (x2) |
Memory | 16x16GB 3200 MTps RDIMMs |
BIOS Version | 1.0.1 |
Operating System | Red Hat Enterprise Linux Server release 7.6 |
Kernel Version | 3.10.0-957.27.2.el7.x86_64 |
Table 3: BIOS Configuration | |
---|---|
System Profile | Performance Optimized |
Logical Processor | Disabled |
Virtualization Technology | Disabled |
NUMA Nodes Per Socket | 4 (C6525) |
Application software versions are as described in Table 4.
Table 4: Software Version | |
---|---|
OpenFOAM | Version 7 with OpenMPI 3.1.4 |
OpenFOAM is a C++ toolbox used primarily for Computational Fluid Dynamics (CFD) analysis. For this study, we used version 7 from the OpenFOAM Foundation (www.openfoam.org). We built OpenFOAM using the GCC compiler version 4.8.5 with the default compile options supplied with version 7. We also built and used OpenMPI version 3.1.4. For these benchmarks we built both single-precision (SP) and double-precision (DP) versions.
We ran benchmarks using datasets supplied with the code distribution in the ‘tutorials’ sub-directory. These benchmark cases are described in Table 5.
Table 5: Benchmark Cases | |||
---|---|---|---|
Case | File location | Mesh size | Solution time |
cyclone | $FOAM_TUTORIALS/lagrangian/MPPICFoam/cyclone | 5M | 0.01 |
propeller | $FOAM_TUTORIALS/multiphase/interPhaseChangeFoam/propeller | 6M | 0.001 |
dam | $FOAM_TUTORIALS/multiphase/multiphaseEulerFoam/damBreak4phaseFine | 15M | 0.1 |
motorbike | $FOAM_TUTORIALS/incompressible/simpleFoam/motorBike | 34M | 2500 |
wedge | $FOAM_TUTORIALS/discreteMethods/dsmcFoam/wedge15Ma5 | 50M | 0.003 |
corner | $FOAM_TUTORIALS/discreteMethods/dsmcFoam/supersonicCorner | 80M | 0.001 |
Figure 1 shows the measured performance of double-precision OpenFOAM on the single-server systems described in Tables 1 and 2, using the benchmark cases described in Table 5.
Figure 1: OpenFOAM 7 DP Single Server Performance
For these benchmarks, performance is derived from the wall clock time required for the main solver section of the simulation and is expressed so that higher is better. The results in Figure 1 are plotted relative to the performance of the AMD EPYC 7451 based server.
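As a rough illustration of how such a relative score could be produced, the sketch below extracts wall clock time from a solver log and normalizes it against a baseline system. It assumes the per-time-step `ExecutionTime = ... s  ClockTime = ... s` lines that OpenFOAM solvers write to their logs, and it assumes that the difference between the first and last reported `ClockTime` is an acceptable stand-in for the main solver section; the exact measurement procedure used for this study is not described here. The function names, the log excerpt, and all timing values are hypothetical placeholders, not measured results.

```python
import re

# Matches the timing line OpenFOAM solvers print each time step, e.g.
# "ExecutionTime = 12.34 s  ClockTime = 13 s".
CLOCK_RE = re.compile(r"ClockTime\s*=\s*([0-9.]+)\s*s")


def solver_wall_time(log_text):
    """Return seconds elapsed between the first and last reported ClockTime.

    Assumption: this interval approximates the main solver section, since
    setup work happens before the first time step is reported.
    """
    clocks = [float(m.group(1)) for m in CLOCK_RE.finditer(log_text)]
    if len(clocks) < 2:
        raise ValueError("need at least two ClockTime entries")
    return clocks[-1] - clocks[0]


def relative_performance(wall_times, baseline):
    """Convert wall clock times to performance relative to a baseline system.

    Performance is taken as inversely proportional to time, so a system that
    finishes in half the baseline time scores 2.0 (higher is better).
    """
    base = wall_times[baseline]
    return {name: base / t for name, t in wall_times.items()}


# Placeholder log excerpt, NOT output from the benchmark runs.
sample_log = """
Time = 1
ExecutionTime = 4.2 s  ClockTime = 5 s
Time = 2
ExecutionTime = 8.1 s  ClockTime = 9 s
Time = 3
ExecutionTime = 12.0 s  ClockTime = 13 s
"""
print(solver_wall_time(sample_log))

# Placeholder timings in seconds, NOT measurements from this study.
placeholder_times = {"7451": 100.0, "7402": 80.0, "7702": 50.0}
print(relative_performance(placeholder_times, baseline="7451"))
```

With this convention the baseline server always scores 1.0, which matches how the figures are normalized to the 7451 based server.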
These results show that the performance advantage of the 7002 series AMD EPYC processors is noticeable but varies across the individual benchmarks. In general, the overall benefit of the Rome (7002 series) based systems diminishes as the size of the benchmark case increases, particularly for the system based on two 64-core 7702 processors. This is likely because the server can fit more of the data for the smaller benchmarks in processor cache, reducing its reliance on main memory for data access. This "cache effect" preferentially benefits processors with more cores, such as the 64-core 7702 processor. The cache effect is also often more noticeable when a problem is spread across multiple nodes. Distributing a larger dataset across multiple nodes is similar to running a smaller problem on a single node, since the model data is spread across all of the nodes used in the simulation.
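The scaling argument above can be made concrete with a small calculation of mesh elements per core, using the mesh sizes from Table 5 and the core counts from Tables 1 and 2. This is only a rough proxy for the working set each core must handle: the memory footprint per element depends on the solver and is not estimated here, and the function name `elements_per_core` is a hypothetical helper introduced for illustration.

```python
# Approximate mesh sizes from Table 5, in millions of elements.
MESH_MILLIONS = {
    "cyclone": 5,
    "propeller": 6,
    "dam": 15,
    "motorbike": 34,
    "wedge": 50,
    "corner": 80,
}

# Total cores per two-socket server, from Tables 1 and 2.
CORES_PER_SERVER = {"7451": 48, "7601": 64, "7402": 48, "7502": 64, "7702": 128}


def elements_per_core(mesh_millions, cores_per_server, nodes=1):
    """Return the average number of mesh elements assigned to each core."""
    return mesh_millions * 1_000_000 / (cores_per_server * nodes)


# Print a small table: one row per case, one column per processor model.
print("case".ljust(10) + "".join(cpu.rjust(12) for cpu in CORES_PER_SERVER))
for case, size in MESH_MILLIONS.items():
    row = "".join(
        f"{elements_per_core(size, cores):12.0f}"
        for cores in CORES_PER_SERVER.values()
    )
    print(case.ljust(10) + row)
```

Doubling the node count halves the elements per core, so an 80M case on two 7702 servers presents each core with the same share as a 40M case on one, which is the sense in which a distributed large problem resembles a smaller single-node problem.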
Figure 2 is similar to Figure 1 but shows the performance with the single-precision OpenFOAM version.
Figure 2: OpenFOAM 7 SP Single Server Performance
The ‘wedge’ and ‘corner’ benchmarks were not compatible with the single-precision version. The single-precision benchmarks display a significantly larger performance benefit for the Rome (7002 series) processors over the Naples (7001 series) processors than the double-precision benchmarks do. These results are similar to what we have seen with other CFD applications. The 24-core and 32-core Rome based servers demonstrate noticeable improvements over their Naples counterparts, and the server based on the 64-core 7702 Rome processor delivers excellent system performance.
The results presented in this blog show that 7002 series AMD EPYC processors offer a significant performance improvement for OpenFOAM relative to 7001 series AMD EPYC processors.