Molecular Dynamics Simulation with GROMACS on AMD EPYC – ROME

Symptoms

Savitha Pareek, HPC and AI Innovation Lab, November 2019

AMD recently announced its 2nd generation EPYC processors (codenamed "Rome"), which support up to 64 cores per processor, and Dell EMC has just released High Performance Computing (HPC) servers designed from the ground up to take full advantage of these new processors. We have been evaluating applications on these servers in our HPC and AI Innovation Lab, including the molecular dynamics application GROMACS (GROningen MAchine for Chemical Simulations). This blog reports our findings for GROMACS.

Resolution

GROMACS is a free, open-source parallel molecular dynamics package designed for simulations of biochemical molecules such as proteins, lipids, and nucleic acids. It is used by a wide variety of researchers, particularly for biomolecular and chemistry simulations, and it supports all the usual algorithms expected from a modern molecular dynamics implementation. The latest versions are available under the GNU Lesser General Public License (LGPL). The code is mainly written in C and C++ and makes use of both MPI and OpenMP parallelism.

This blog describes the performance of GROMACS on two-socket PowerEdge servers using the AMD EPYC Rome processors listed in Table 1(a). For this study, we carried out all benchmarks on a single server equipped with two processors, running only a single job at a time on the server. We compared the 2nd generation AMD EPYC Rome (7xx2 series) based PowerEdge servers with previous-generation Dell EMC PowerEdge servers equipped with the 1st generation AMD EPYC Naples (7xx1 series) processor listed in Table 1(b).

Table 1(a). Rome CPU models evaluated for the single-node study

CPU	Cores/Socket	Config	Base frequency	TDP
7742	64c	4c per CCX	2.25 GHz	225W
7702	64c	4c per CCX	2.0 GHz	200W
7502	32c	4c per CCX	2.5 GHz	180W
7452	32c	4c per CCX	2.35 GHz	155W
7402	24c	3c per CCX	2.8 GHz	180W

Table 1(b). Naples CPU model evaluated for comparison

CPU	Cores/Socket	Config	Base frequency	TDP
7601	32c	4c per CCX	2.2 GHz	180W

Server configurations are included in Table 2(a), with the list of the benchmark data sets given in Table 2(b).

Table 2(a). Testbed

Component	ROME Platform	NAPLES Platform
Processor	As shown in Table 1(a)	As shown in Table 1(b)
Memory	256 GB, 16x16GB 3200 MT/s DDR4	256 GB, 16x16GB 2400 MT/s DDR4
Operating System	Red Hat Enterprise Linux 7.6	Red Hat Enterprise Linux 7.5
Kernel	3.10.0-957.27.2.el7.x86_64	3.10.0-862.el7.x86_64
Application	GROMACS – 2019.2

Table 2(b). Benchmark datasets used for GROMACS performance evaluation on Rome

Dataset	System size (atoms)
Water Molecule	1536K and 3072K
HecBioSim	1400K and 3000K
Prace – Lignocellulose	3M

For this single-node study, we compiled GROMACS version 2019.3 with the latest Open MPI and FFTW, testing several different compilers, their associated high-level compiler options, and load balancing of the electrostatics calculation (for example, PME). We carried out two studies for this blog: the first focused on the performance of the Rome based systems with Hyperthreading enabled versus disabled, and the second investigated the performance advantage of Rome over the Naples system.

For the Hyperthreading study (simultaneous multithreading, SMT, in AMD's terminology), we enabled Hyperthreading in the BIOS and adjusted the benchmarking parameters to run each benchmark with twice as many threads as its non-Hyperthreaded counterpart. For example, on the 24-core 7402, the non-Hyperthreaded single-node runs used 48 threads (two processors per server) and the Hyperthreaded runs used 96 threads. Our results are presented in Figure 1.

Figure 1. GROMACS performance evaluation with hyper-threading disabled vs hyper-threading enabled on ROME
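
The thread counts used in the Hyperthreading study follow directly from the core counts in Table 1(a). The sketch below works through that arithmetic and, as an illustration only, builds a possible launch command. The names Cpu, total_threads and mdrun_command are hypothetical helpers introduced here. The split between MPI ranks and OpenMP threads, the gmx_mpi binary name, and the topol.tpr input file are assumptions, since this blog does not state how the runs were launched.

```python
from dataclasses import dataclass

SOCKETS_PER_SERVER = 2  # every server in this study is dual-socket


@dataclass(frozen=True)
class Cpu:
    model: str
    cores_per_socket: int


# Core counts copied from Table 1(a).
ROME_CPUS = [
    Cpu("7742", 64),
    Cpu("7702", 64),
    Cpu("7502", 32),
    Cpu("7452", 32),
    Cpu("7402", 24),
]


def total_threads(cpu: Cpu, hyperthreading: bool) -> int:
    """Threads per benchmark run: one per physical core, doubled with Hyperthreading."""
    physical_cores = cpu.cores_per_socket * SOCKETS_PER_SERVER
    return physical_cores * 2 if hyperthreading else physical_cores


def mdrun_command(cpu: Cpu, hyperthreading: bool,
                  omp_threads_per_rank: int = 1,
                  tpr_file: str = "topol.tpr") -> list:
    """Build an illustrative launch command (assumed layout, not the authors' script).

    ranks * omp_threads_per_rank always equals total_threads(cpu, hyperthreading).
    """
    threads = total_threads(cpu, hyperthreading)
    if omp_threads_per_rank < 1 or threads % omp_threads_per_rank != 0:
        raise ValueError("omp_threads_per_rank must divide the total thread count")
    ranks = threads // omp_threads_per_rank
    return ["mpirun", "-np", str(ranks), "gmx_mpi", "mdrun",
            "-ntomp", str(omp_threads_per_rank), "-s", tpr_file]


if __name__ == "__main__":
    for cpu in ROME_CPUS:
        print(cpu.model, total_threads(cpu, False), total_threads(cpu, True))
```

For the 7402 this gives 48 threads without Hyperthreading and 96 with it, matching the example above. The best ratio of MPI ranks to OpenMP threads depends on the dataset and is normally found by experiment.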

For these benchmarks, long-range electrostatics were computed with Particle Mesh Ewald (PME) for the Water-1536K, Water-3072K, and HECBioSim (1.4M and 3M) datasets. We used reaction-field (RF) electrostatics for the Lignocellulose_3M case.
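
In a GROMACS run-parameter (.mdp) file, the electrostatics method is selected with the coulombtype option. As a rough illustration of the choice described above, the sketch below maps each benchmark to the coulombtype value it would correspond to. The dataset labels and the helper name mdp_electrostatics_line are made up for this example, and the benchmark inputs normally ship with their own parameter files, so this is not the configuration used for the published runs.

```python
# Electrostatics method per benchmark, as stated in the text.
# The dictionary keys are shorthand labels chosen for this sketch.
COULOMBTYPE_BY_DATASET = {
    "water_1536k": "PME",
    "water_3072k": "PME",
    "hecbiosim_1400k": "PME",
    "hecbiosim_3000k": "PME",
    "lignocellulose_3m": "Reaction-Field",
}


def mdp_electrostatics_line(dataset: str) -> str:
    """Return the .mdp line that selects the electrostatics method for a dataset."""
    return f"coulombtype = {COULOMBTYPE_BY_DATASET[dataset]}"


if __name__ == "__main__":
    for name in COULOMBTYPE_BY_DATASET:
        print(f"{name}: {mdp_electrostatics_line(name)}")
```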

While the performance gains from enabling Hyperthreading (higher is better) varied across processors and datasets, every Hyperthreaded result was better than its non-Hyperthreaded baseline (1.0). GROMACS shows a clear performance boost with Hyperthreading enabled across the Rome SKUs.

In the second study, we compared the Rome based servers to the Naples based server, with Hyperthreading enabled for all tests based on the results from the first study. We measured the performance of each Rome SKU relative to the Naples 7601 as the baseline (1.0). These results are shown in Figure 2.

Figure 2. Performance evaluation across different AMD EPYC Generation Processors
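
Both figures plot performance relative to a baseline of 1.0. A minimal sketch of that normalization is shown below, assuming the underlying metric is the ns/day value that GROMACS prints on the "Performance:" line at the end of its log file (this blog does not state which metric was used). The function names are hypothetical, and the numbers in the usage example are placeholders, not measurements from this study.

```python
import re

# Matches the summary line near the end of a GROMACS log, for example:
# "Performance:       12.345        1.944"   (ns/day, then hour/ns)
_PERF_LINE = re.compile(r"^Performance:\s+([0-9.]+)\s+([0-9.]+)", re.MULTILINE)


def ns_per_day(log_text: str) -> float:
    """Extract the ns/day figure from the text of a GROMACS log file."""
    match = _PERF_LINE.search(log_text)
    if match is None:
        raise ValueError("no 'Performance:' line found in log text")
    return float(match.group(1))


def relative_performance(ns_day_by_system: dict, baseline: str) -> dict:
    """Divide every result by the baseline result, so the baseline becomes 1.0."""
    base = ns_day_by_system[baseline]
    if base <= 0:
        raise ValueError("baseline performance must be positive")
    return {name: value / base for name, value in ns_day_by_system.items()}


if __name__ == "__main__":
    # Placeholder values for illustration only, not results from this study.
    placeholder = {"7601": 10.0, "7502": 15.0}
    print(relative_performance(placeholder, baseline="7601"))
```

With this convention, a value of 1.5 corresponds to a 50% improvement over the baseline. The same function covers the Hyperthreading comparison if the non-Hyperthreaded run of each processor is passed as the baseline.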

Comparing the 32-core based servers (7551, 7601, 7452, 7502), we observed a generational performance improvement of about 50%. The 24-core Rome based 7402, despite having fewer cores than the Naples systems, still outperformed them by about 20-40%, depending on the benchmark. The 64-core (7702, 7742) systems displayed close to a 250% increase in overall performance over the 32-core Naples server. Overall, the Rome results, particularly with Hyperthreading enabled, demonstrated a substantial performance improvement for GROMACS over Naples.

Conclusion

Dell EMC PowerEdge servers equipped with AMD Rome processors offer significant single-node performance gains over their previous-generation Naples counterparts for applications such as GROMACS. We found a strong positive correlation between overall system performance and processor core count, and a weak correlation with processor frequency. The 64-core Rome processors delivered a sizable performance advantage over the 24-core and 32-core processors. We are in the process of exploring how these single-node performance gains (with and without Hyperthreading) will translate into multi-node performance gains for molecular dynamics applications on our new Minerva Cluster at the HPC and AI Innovation Lab. Watch this blog site for updates.

Affected Products

High Performance Computing Solution Resources

Article Number: 000134870

Article Type: Solution

Last Modified: 21 Feb 2021

Version: 3
