
NAMD Performance on PowerEdge R740 with Volta GPUs

Summary: PowerEdge R740, NAMD, NVIDIA V100-PCIe GPUs, HPC, High Performance Computing, HPC and AI Innovation Lab, Performance


Symptoms

Article written by Deepthi Cherlopalle and Frank Han of HPC and AI Innovation Lab in September 2018


Resolution

Many HPC applications can benefit from NVIDIA GPUs, and NAMD is one of them. Nanoscale Molecular Dynamics (NAMD) is open-source molecular dynamics simulation software built on the Charm++ parallel programming model and designed for high-performance simulation of large biomolecular systems. To benefit from GPU acceleration, a CUDA build of NAMD is needed. Instructions to compile NAMD with CUDA support can be found here.

This article discusses NAMD performance with NVIDIA's latest Volta GPU, the V100-32GB, in a single Dell EMC PowerEdge R740 server.

The PowerEdge R740 is a 2U dual-socket server with Intel Skylake processors. It can be configured with up to three double-width GPUs and a high-speed networking adapter.

Table 1 shows the hardware configuration and application details used for the tests. A nightly build of NAMD dated 08-17-2018 was compiled for these tests.

       
Table 1 Hardware and Software Configuration Details

Component                 Details
Server                    PowerEdge R740
Processor                 2 x Intel Xeon 6148, 20 cores per processor @ 2.4 GHz
Memory                    192 GB @ 2666 MT/s
GPU                       3 x NVIDIA Volta V100-PCIe (32 GB)
Power Supply              2 x 1600 W
Operating System          RHEL 7.4
                          Kernel: 3.10.0-693.el7.x86_64
BIOS Options              System Profile: Performance
                          Logical processor: Disabled
                          Turbo mode: Enabled
CUDA Version and Driver   CUDA 9.2 (396.26)
NAMD                      Git-2018-08-17_Source (nightly build), multi-core build
Compiler                  Intel 2018 u3

 

 


Performance Results

NAMD was tested with three datasets: ApoA1, F1ATPase, and STMV, which consist of 92K, 327K, and 1066K atoms respectively. ApoA1 is the smallest of the three. The performance metric is "ns/day", the number of nanoseconds of simulated time completed per day of wall-clock time, so higher is better. The data shown in this section is the average of 10 tests.
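
As a rough illustration of the metric, the sketch below converts a measured wall-clock time per simulation step into ns/day and averages several runs. The function names, the 2 fs timestep, and the step times are hypothetical values chosen for the example; they are not taken from these tests, and the article does not state how its 10 runs were averaged.

```python
from statistics import mean

SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000  # femtoseconds in one nanosecond


def ns_per_day(seconds_per_step: float, timestep_fs: float) -> float:
    """Convert wall-clock seconds per MD step into simulated ns per day."""
    if seconds_per_step <= 0 or timestep_fs <= 0:
        raise ValueError("step time and timestep must be positive")
    steps_per_day = SECONDS_PER_DAY / seconds_per_step
    return steps_per_day * timestep_fs / FS_PER_NS


def average_ns_per_day(step_times, timestep_fs: float) -> float:
    """Average the per-run ns/day values (one assumed way to average runs)."""
    return mean(ns_per_day(t, timestep_fs) for t in step_times)


if __name__ == "__main__":
    # Hypothetical step times in seconds from repeated runs, 2 fs timestep.
    runs = [0.0864, 0.0432]
    print(f"{average_ns_per_day(runs, timestep_fs=2.0):.2f} ns/day")
```

Halving the time per step doubles ns/day, which is why the metric scales directly with throughput.
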
Figure 1 shows the NAMD performance with CPU and multiple GPUs on three datasets. Instructions from Intel’s website were followed to compile NAMD for Intel Xeon processors. Performance improvement from CPU to GPUs is noted in the graph below.

Figure 1 NAMD performance on CPU and GPUs

  •     For each dataset, performance is shown relative to the dual-CPU result.
  •     The GPU build of NAMD is faster than the CPU-only build on all three datasets, with a speedup of up to 11.2x.
  •     Adding a second GPU card increased performance by a further 23%-33%. Going from 2 GPUs to 3 GPUs gave a more modest 6%-9% increase.
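
The scaling percentages above can be turned into two related quantities with a little arithmetic: how much of the larger configuration's performance the smaller one delivers, and the bounds on the combined gain from 1 GPU to 3 GPUs. The sketch below uses only the 23%-33% and 6%-9% ranges quoted above; the function names are made up for this example, and the bounds hold whichever dataset each end of a range came from.

```python
def fraction_of_larger_config(incremental_gain: float) -> float:
    """If adding a GPU gives a relative gain g, the smaller configuration
    delivers 1 / (1 + g) of the larger configuration's performance."""
    return 1.0 / (1.0 + incremental_gain)


def cumulative_gain_bounds(*gain_ranges):
    """Multiply successive (low, high) relative gains to bound the total
    speedup factor across several upgrade steps."""
    low, high = 1.0, 1.0
    for g_low, g_high in gain_ranges:
        low *= 1.0 + g_low
        high *= 1.0 + g_high
    return low, high


if __name__ == "__main__":
    # 2-GPU performance as a fraction of 3-GPU performance (6%-9% gain).
    for gain in (0.06, 0.09):
        print(f"gain {gain:.0%}: {fraction_of_larger_config(gain):.3f}")

    # Bounds on the 1-GPU to 3-GPU speedup factor.
    low, high = cumulative_gain_bounds((0.23, 0.33), (0.06, 0.09))
    print(f"1 -> 3 GPUs: {low:.2f}x to {high:.2f}x")
```

With these inputs the 2-GPU configuration works out to roughly 0.92 to 0.94 of the 3-GPU performance, and the step from one GPU to three is bounded between about 1.30x and 1.45x.
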
     

NAMD was also tested with different numbers of CPU cores on 1, 2, and 3 GPUs to identify the minimum number of CPU cores needed for best performance.

Figure 2 STMV performance with different CPU core counts and GPUs

  •     As seen in Figure 2, using too few CPU cores hurts performance, and performance improves as the number of cores increases.
  •     With one V100, 20 CPU cores are enough to achieve good performance. Using more than 20 cores adds a further 2-3%.
  •     This test was also run on the other two datasets, ApoA1 and F1ATPase, and showed similar behavior.
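
A sweep like the one described above could be scripted along the following lines. This is a minimal sketch rather than the procedure used for these tests: the article does not give the exact command lines, so the core counts, the `stmv.namd` input file name, and the `namd2` binary location are assumptions. The `+p`, `+devices`, and `+setcpuaffinity` options are standard for multi-core CUDA builds of NAMD. The script only builds and prints the commands; it does not run them.

```python
import shlex


def namd_command(cores, gpu_ids, config="stmv.namd", binary="namd2"):
    """Build one NAMD command line for a given core count and GPU list.

    `config` and `binary` are placeholder names, not taken from the article.
    """
    devices = ",".join(str(g) for g in gpu_ids)
    return [binary, f"+p{cores}", "+setcpuaffinity", "+devices", devices, config]


def sweep(core_counts, max_gpus):
    """Yield (cores, n_gpus, command) for every core count and GPU count."""
    for n_gpus in range(1, max_gpus + 1):
        for cores in core_counts:
            yield cores, n_gpus, namd_command(cores, list(range(n_gpus)))


if __name__ == "__main__":
    # Hypothetical core counts; the server under test has 40 physical cores.
    for cores, n_gpus, cmd in sweep((10, 20, 30, 40), max_gpus=3):
        print(f"{n_gpus} GPU(s), {cores} cores: {shlex.join(cmd)}")
```

Each printed command could then be timed several times and the results averaged before comparing configurations.
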
     


     


Summary

This article presented NAMD performance results for a Dell EMC PowerEdge R740 server with NVIDIA Volta V100-32GB GPUs. Using GPUs gives a speedup of about 6.6x-11.2x over CPU-only tests. An R740 server with 3 V100 cards provides the best performance, while an R740 with 2 V100 cards reaches about 91% of the 3-GPU performance. If budget is limited, one GPU per R740 is also a good choice, as it gives up to 8x speedup compared to CPU only. Tests with varied CPU core counts and GPU counts were also conducted, and using too few CPU cores (fewer than 20 in our tests) reduced NAMD performance in the GPU tests.



     



     

     

Affected Products

High Performance Computing Solution Resources, PowerEdge R740