
PowerEdge: Intel Cluster-On-Die (COD) technology on VMware ESXi

Summary: This article describes the Intel Cluster-On-Die (COD) technology as it applies to VMware ESXi.


Instructions

Introduction

In non-uniform memory access (NUMA) enabled systems, memory channels are distributed across the processors. All memory-related operations require snoop operations to maintain cache coherency. Snooping probes the caches of both local and remote processors to determine whether a copy of the requested data resides in any of them. If NUMA is disabled (Node Interleaving enabled in BIOS), snoop mode is disabled automatically.

Three snoop modes are available in the Intel Haswell microarchitecture. Dell 13th-generation (13G) servers support all three:

1) Early snoop

2) Home snoop

3) Cluster On Die

This article discusses the Cluster-On-Die (COD) snoop mode in the context of VMware ESXi. It covers the following topics:

  • Basics of COD
  • Pre-requisites for enabling COD, from both the hardware and the VMware ESXi point of view
  • A few command-line options in ESXi that show how the NUMA listing differs with COD enabled and disabled

Before getting into the details of COD, it helps to understand how processors based on the Intel Haswell microarchitecture are grouped by core count.

Intel classifies Haswell processors into the following types:

1) LCC - Low core count [4-8 cores]

2) MCC - Medium core count [10-12 cores]

3) HCC - High core count [14-18 cores]

             

Note: These core-count ranges vary between Intel microarchitectures.
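The classification above can be expressed as a small lookup. The following is a minimal sketch that assumes only the Haswell ranges listed in this article; the names `HASWELL_CORE_RANGES` and `haswell_die_type` are made up for illustration, and a core count outside the listed ranges returns `None` rather than a guess.

```python
from typing import Optional

# Core-count ranges for Haswell as listed in this article.
# Other Intel microarchitectures use different ranges.
HASWELL_CORE_RANGES = {
    "LCC": range(4, 9),    # 4-8 cores
    "MCC": range(10, 13),  # 10-12 cores
    "HCC": range(14, 19),  # 14-18 cores
}


def haswell_die_type(core_count: int) -> Optional[str]:
    """Return 'LCC', 'MCC', or 'HCC', or None if the count is not in a listed range."""
    for name, cores in HASWELL_CORE_RANGES.items():
        if core_count in cores:
            return name
    return None


print(haswell_die_type(12))
print(haswell_die_type(18))
```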

 

What is Cluster-On-Die (COD) mode?

COD is a snoop mode introduced with the Intel Haswell processor family and is available on processors that have 10 or more cores. For the MCC and HCC processor categories, Intel has incorporated two memory controllers on a single processor socket, whereas an LCC processor has only one memory controller. Each memory controller in a processor socket acts as one home agent (HA).

On COD-enabled servers, each processor socket is logically split into two NUMA nodes. Each NUMA node has half of the physical cores and half of the last-level cache (LLC), along with one home agent. The term "cluster" refers to the grouping of processor cores with their corresponding memory controller on the socket die. Each home agent uses two memory channels and sees requests from fewer logical cores, which provides higher memory bandwidth and lower latency. This operating mode is intended for NUMA-optimized workloads. Operating systems determine the number of NUMA nodes by reading the ACPI SRAT table.
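The split described above can be sketched in a few lines. This is only an illustration under stated assumptions: the processor is an MCC or HCC part (two home agents per socket), resources divide evenly between the two clusters, and the names `NumaNode` and `numa_layout` as well as the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NumaNode:
    socket: int        # physical socket the node belongs to
    cores: int         # physical cores in the node
    llc_mb: float      # share of the last-level cache, in MB
    home_agents: int   # home agents (memory controllers) in the node


def numa_layout(sockets: int, cores_per_socket: int,
                llc_mb_per_socket: float, cod_enabled: bool) -> List[NumaNode]:
    """List the NUMA nodes of a system built from MCC or HCC processors."""
    nodes_per_socket = 2 if cod_enabled else 1
    nodes = []
    for socket in range(sockets):
        for _ in range(nodes_per_socket):
            nodes.append(NumaNode(
                socket=socket,
                cores=cores_per_socket // nodes_per_socket,
                llc_mb=llc_mb_per_socket / nodes_per_socket,
                # Two home agents per socket; with COD, one per cluster.
                home_agents=2 // nodes_per_socket,
            ))
    return nodes


# Illustrative values only: two sockets, each with 12 cores and 30 MB of LLC.
for node in numa_layout(2, 12, 30.0, cod_enabled=True):
    print(node)
```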

A graphical representation of COD is as follows:
COD Disabled 
COD Enabled 
 

The second image shows that a single processor socket die is divided into two logical nodes when COD is enabled.

Pre-requisites:

This section covers the pre-requisites from both the hardware and the VMware ESXi point of view.

Hardware:

  • COD can be enabled only on Intel Haswell-EP processors with 10 or more cores.
  • Memory population: memory must be populated on alternate memory channels (CH0 and CH2, CH1 and CH3). For example, the R730, R730xd, R630, and T630 servers have four memory channels per socket.

An example makes this pre-requisite clearer. With only two memory modules populated, the following slots must be used:

  • A1 & A3 

With four memory modules,

  • A1, A3 & B1, B3

With eight memory modules,

  • A1, A3, B1, B3 & A2, A4, B2, B4
NOTE: A minimum of two memory modules must be populated in order to enable COD.
  • The Cluster On Die token must be enabled in the BIOS settings.

BIOS Settings 
 

VMware ESXi:

  • VMware first supported COD in vSphere 6.0, and it is now supported in ESXi 5.5 U3b as well. See VMware KB 2142499 for details.
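The hardware pre-requisites above can be collected into a simple checklist function. This is a sketch, not a Dell or VMware tool: the function name and parameters are hypothetical, and the channel rule encodes one reading of "alternate memory channels" (CH0 and CH2 populated together, then CH1 and CH3 together), which is an assumption based on the slot examples in this article. The article also does not say whether the two-module minimum is per socket or per system, so the check applies to whatever count is passed in.

```python
from typing import List, Set


def cod_hardware_issues(cores_per_socket: int, memory_modules: int,
                        populated_channels: Set[int],
                        bios_cod_enabled: bool) -> List[str]:
    """Return the unmet hardware pre-requisites; an empty list means none were found."""
    issues = []
    if cores_per_socket < 10:
        issues.append("processor has fewer than 10 cores")
    if memory_modules < 2:
        issues.append("fewer than two memory modules populated")
    # Assumption: alternate channels means CH0 + CH2 first, then CH1 + CH3 as a pair.
    if not {0, 2} <= populated_channels:
        issues.append("CH0 and CH2 are not both populated")
    if len({1, 3} & populated_channels) == 1:
        issues.append("CH1 and CH3 are not populated as a pair")
    if not bios_cod_enabled:
        issues.append("Cluster On Die is not enabled in BIOS")
    return issues


# Illustrative input: a 12-core socket with modules on CH0 and CH2.
print(cod_hardware_issues(12, 2, {0, 2}, bios_cod_enabled=True))
```

An empty result only means that none of the listed checks failed; it does not confirm that ESXi actually reports the extra NUMA nodes.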

How do I check COD status from VMware ESXi?

VMware ESXi reads the ACPI System Resource Affinity Table (SRAT) and System Locality Information Table (SLIT) to identify and map the available hardware resources, including the NUMA nodes. This section describes a few command-line options that can be used to check the COD state from VMware ESXi.

  • Esxtop provides an option to see the populated NUMA nodes. After starting esxtop, press 'm' to see the NUMA node details as follows.

The following screenshots are taken from a system with two processor sockets and 128 GB system memory. In the default configuration without COD enabled, esxtop would display two NUMA nodes with 64 GB allocated per NUMA node. The following figure shows the esxtop command output in VMware ESXi with COD disabled. 
esxtop COD Disabled 

With COD enabled, esxtop lists four NUMA nodes instead of two, because each processor socket die is divided into two.

esxtop COD Enabled 
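The per-node memory figures follow from simple division. The sketch below assumes memory is populated evenly across sockets and channels, so the 32 GB per node with COD enabled is a derived value, not one quoted from the screenshots; the function name is hypothetical.

```python
def memory_per_numa_node_gb(total_memory_gb: float, sockets: int,
                            cod_enabled: bool) -> float:
    """Memory per NUMA node, assuming an even split across all nodes."""
    nodes = sockets * (2 if cod_enabled else 1)
    return total_memory_gb / nodes


# The example system from this article: two sockets, 128 GB of memory.
print(memory_per_numa_node_gb(128, 2, cod_enabled=False))  # 64.0
print(memory_per_numa_node_gb(128, 2, cod_enabled=True))   # 32.0
```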
Esxcli provides a few command-line options that display the number of NUMA nodes exposed by the hardware.
esxcli
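The node count can also be read programmatically. The sketch below assumes the output of `esxcli hardware memory get` contains a `NUMA Node Count:` line; the screenshot in this article may show a different esxcli command. The sample text is illustrative and was not captured from the system described here, and the helper names are hypothetical.

```python
import re
from typing import Optional

# Illustrative output only, not captured from the system in this article.
SAMPLE_OUTPUT = """\
   Physical Memory: 137438953472 Bytes
   Reliable Memory: 0 Bytes
   NUMA Node Count: 4
"""


def numa_node_count(esxcli_output: str) -> Optional[int]:
    """Extract the NUMA node count, or None if the line is missing."""
    match = re.search(r"NUMA Node Count:\s*(\d+)", esxcli_output)
    return int(match.group(1)) if match else None


def cod_appears_enabled(node_count: int, sockets: int) -> bool:
    """With COD enabled, each socket is exposed as two NUMA nodes."""
    return node_count == 2 * sockets


count = numa_node_count(SAMPLE_OUTPUT)
print(count, cod_appears_enabled(count, sockets=2))
```

The socket count has to be supplied separately, since the comparison only makes sense against the number of populated sockets.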

Benefits

In COD mode, the operating system sees two NUMA nodes per socket. COD has the best local latency. Each home agent sees requests from fewer threads, potentially offering higher memory bandwidth. COD mode has in-memory directory bit support. This mode is best for highly NUMA-optimized workloads. See the blog published by the Dell HPC team for details on the different snoop modes.

References

VMware KB calling out Intel COD support

Affected Products

PowerEdge C4130, PowerEdge C6320, PowerEdge C6320p, PowerEdge FC430, PowerEdge FC630, PowerEdge FC830, PowerEdge M630, PowerEdge M630 (for PE VRTX), PowerEdge M830, PowerEdge M830 (for PE VRTX), PowerEdge R230, PowerEdge R330, PowerEdge R430, PowerEdge R530, PowerEdge R530xd, PowerEdge R630, PowerEdge R730, PowerEdge R730xd, PowerEdge R830, PowerEdge R930, PowerEdge T130, PowerEdge T330, PowerEdge T430, PowerEdge T630, VMware ESXi 6.5.X, VMware ESXi 6.7.X, VMware ESXi 6.x, VMware ESXi 7.x, VMware ESXi 8.x ...
Article Properties
Article Number: 000147278
Article Type: How To
Last Modified: 11 Dec 2024
Version:  8