PowerStore: Mapping an NVMeoF volume may lead to service disruption on multi-appliance clusters

Summary: Mapping NVMeoF volumes on a multi-appliance cluster may lead to service disruption for the appliance on which the volume is created.

Symptoms

Mapping NVMeoF volumes on a multi-appliance cluster may lead to service disruption for the appliance on which the volume is created. This can occur only on the second and later appliances in the cluster. It does not occur on the first appliance.

 

Environment:

  • Multi-appliance cluster
  • Hosts are connected over NVMe/FC or NVMe/TCP.
  • Either (a) the add-appliance operation failed multiple times, or (b) multiple remove-appliance operations were performed.

 

Symptoms:

  • A node may reboot unexpectedly.
  • If both nodes of the appliance reboot, a service disruption may occur.

 

Cause

  • NVMeoF (NVMe/FC or NVMe/TCP) provides a basic mechanism called Asymmetric Namespace Access (ANA).
    ANA applies on appliances where the access characteristics of a volume may differ between NVMe controllers.
    Example: Volume-1 may be optimized on Node-A and non-optimized on Node-B.
  • The concept is similar to ALUA with Target Port Groups (TPG):
    Each node is assigned a unique TPG ID, which distinguishes the state of each node (optimized or non-optimized).
  • With NVMe-oF on PowerStore, each appliance has several ANA groups:
    • ANA Group #1 - Used for volume migration between appliances (the group ID is 1 across the cluster)
    • ANA Group #X - Used to describe volumes where Node-A is optimized and Node-B is non-optimized
    • ANA Group #Y - Used to describe volumes where Node-A is non-optimized and Node-B is optimized
    • ANA Group #Z (Future Use) - Used to describe volumes where Node-A and Node-B are optimized (Active/Active)
  • When an appliance is added, Control-Path uses a special sequence number to determine the target port group ID to create.
    This sequence only increments, even when the add-appliance operation fails, so it can become quite large if the operation fails several times.
  • Due to a software issue, the maximum ANA Group ID is limited, while the Control-Path sequence has no limit.
  • When a volume is mapped to an NVMe host, it is classified into the correct ANA group; the ANA group is derived from the TPG ID of the node that owns the volume.
  • As a result, the mapping operation may cause a software module failure, which in turn may lead to a node reboot.
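
The interaction described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not PowerStore code: the limit value (MAX_ANA_GROUP_ID = 8), the one-to-one mapping from sequence number to TPG ID to ANA group ID, and all names (ControlPath, ana_group_for, simulate) are hypothetical and chosen purely for illustration. The article does not state the real limit or the real derivation.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical limit, for illustration only. The article does not give the real value.
MAX_ANA_GROUP_ID = 8


@dataclass
class ControlPath:
    """Toy model of a sequence counter that only ever increments."""
    next_sequence: int = 1

    def add_appliance(self, succeeds: bool) -> Optional[int]:
        # The sequence number is consumed even when the add operation fails.
        seq = self.next_sequence
        self.next_sequence += 1
        # Assumption: the TPG ID of the new appliance equals the sequence number.
        return seq if succeeds else None


def ana_group_for(tpg_id: int) -> int:
    """Assumed derivation: the ANA group ID equals the TPG ID."""
    if tpg_id > MAX_ANA_GROUP_ID:
        # Stands in for the software module failure triggered by the mapping.
        raise ValueError(f"ANA group ID {tpg_id} exceeds limit {MAX_ANA_GROUP_ID}")
    return tpg_id


def simulate(failed_attempts: int) -> int:
    """Return the TPG ID given to the second appliance after some failed adds."""
    cp = ControlPath()
    cp.add_appliance(succeeds=True)           # first appliance
    for _ in range(failed_attempts):
        cp.add_appliance(succeeds=False)      # each failure still burns a number
    tpg_id = cp.add_appliance(succeeds=True)  # second appliance finally joins
    assert tpg_id is not None
    return tpg_id
```

In this model the first appliance always receives a low ID, which is consistent with the issue not appearing there. With no failed attempts the second appliance stays under the assumed limit, while ten failed attempts push its ID past the limit, so ana_group_for raises on the first mapping.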

 

Resolution

This issue is fixed in PowerStoreOS 4.0.0.

 

Workaround

  • Escalate to Global Services for assistance. After recovery, plan to upgrade to PowerStoreOS 4.0.0. Reference this KB article for expedited attention.

 

Affected Products

PowerStore
Article Properties
Article Number: 000216639
Article Type: Solution
Last Modified: 28 May 2024
Version:  3