PowerStore: During Initial Configuration Wizard installation (ICW), or when adding an appliance to an existing cluster, the task fails with "Unconfigured Faulted".

Summary: During Initial Configuration Wizard installation (ICW), or when adding an appliance to an existing cluster, the task fails with "Unconfigured Faulted"

Symptoms

Issue

During initial installation of an appliance to a new Cluster, or when adding a new appliance to an existing Cluster, there are hardware and network checks that can fail, which may change the status of the appliance from "Unconfigured" to "Unconfigured Faulted". This state means that the appliance cannot join a new cluster, or be added to an existing cluster.  The fault must be cleared before the appliance can be added to a cluster.



Cause

 

Resolution



To determine whether a hardware or network issue is causing the "Unconfigured Faulted" condition, perform the following steps:

If you are adding the appliance to an existing Cluster (from the PowerStore Manager):

  1. Access the system using the Service LAN port access method (See PowerStore: Accessing a Node for details).
  2. Once logged in to the Service container as the service user, run the following Service scripts to determine whether there is a hardware or network issue:

    svc_diag list --icw_hardware
    svc_diag list --network
     
  3. If no errors or issues are reported, it is possible that the original symptoms seen during the ICW or add appliance operation were transient.
     
  4. If an error or issue is indicated and you cannot determine how to resolve it, run the following script to produce a Data Collect, and contact your Service Provider for assistance:

    svc_dc run
     
  5. When you are ready to retry:
    • If you are installing an appliance to a new cluster, close your Discovery Tool and/or browser, then relaunch the Discovery Tool or open the browser with the static Service LAN IP address for Node A [i.e. 128.221.1.252]. If the system displays that it is in an 'Unconfigured' state, resume your initial configuration/ICW steps.
    • If you are adding an appliance to an existing cluster, retry the Add Appliance operation to see if the task succeeds.
       
  6. In the event the task still fails, perform a Data Collection to obtain relevant logs, and contact your Service Provider for assistance.

The following is an example of successful output from "svc_diag list --icw_hardware". Each block of sample output is followed by a description of what it checks and the errors you may see.

hw_type Warnado-EX
Running on Node A

 

Node A FRU Status

OK | Peer Node | 0x0f80
OK | Local Node | 0x1480
OK | Embedded Module | 0x8b81
OK | 4-Port Card | 0x8b81
OK | I/O Module 0 | 0x8b81
OK | I/O Module 1 | 0x8b81
OK | Internal Backup Battery Module | 0x3380

Node B FRU Status
OK | Peer Node | 0x0f80
OK | Local Node | 0x1580
OK | Embedded Module | 0x8b81
OK | 4-Port Card | 0x8b81
OK | I/O Module 0 | 0x8b81
OK | I/O Module 1 | 0x8b81
OK | Internal Backup Battery Module | 0x1380

These tables consist of three columns:

Summary | FRU Name | Status Sensor value

The Summary column on the left should be read as follows:

  • OK = FRU status is good.
  • Empty = FRU is missing and/or not detected by the appliance. Since I/O Modules are optional, it can be normal to see "Empty" status for I/O Modules. (In that case, the same I/O Module slot(s) must be Empty on both nodes.) All other FRUs are required hardware and should always be "OK".
  • Off = FRU is powered off. FRU may need to be replaced.
  • Unknown = Status Sensor value contains unexpected values. FRU may need to be replaced.

Recommended Action for failure: Consult related KB articles for details on how to resolve these hardware issues. These include: SLN317238/SLN320677 (Nodes), SLN317221 (I/O Modules, 4-Port Card), and SLN320676 (Embedded Module).
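
As a rough illustration of the Summary rules above, the following Python sketch scans saved FRU Status lines and lists the FRUs that need attention. It is not part of the PowerStore tooling: the function name `find_fru_problems` and the `OPTIONAL_FRUS` tuple are hypothetical, and the "Empty" and "Off" rows in the sample (including their sensor values) are made up for illustration. It also does not verify that an Empty I/O Module slot is Empty on both nodes.

```python
# Sketch only: the line layout is copied from the sample output above.
OPTIONAL_FRUS = ("I/O Module",)  # assumption: only I/O Modules may be Empty


def find_fru_problems(lines):
    """Return (fru_name, summary) pairs that need attention."""
    problems = []
    for line in lines:
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3:
            continue  # table titles and blank lines
        summary, fru, _sensor = parts
        if summary == "OK":
            continue
        if summary == "Empty" and fru.startswith(OPTIONAL_FRUS):
            continue  # an empty optional slot is normal
        problems.append((fru, summary))
    return problems


# The Empty and Off rows (and their sensor values) are made up for illustration.
sample = [
    "Node A FRU Status",
    "OK | Peer Node | 0x0f80",
    "OK | Local Node | 0x1480",
    "Empty | I/O Module 1 | 0x0000",
    "Off | 4-Port Card | 0x0000",
]
print(find_fru_problems(sample))
```

Here only the 4-Port Card is reported, because an Empty I/O Module slot is treated as acceptable.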

IO Module Consistency Check = Success

Node Consistency Check = Success

Battery Check = OK

These checks compare the FRU Status Summary values from each node. Both nodes are expected to report the same Summary value for each FRU.

Recommended Action for failure:

  1. For I/O Modules, consult KB article SLN317221.
  2. For Nodes, consult KB articles SLN317238/SLN320677.
  3. The Internal Backup Battery Module check will always be OK unless FRU status cannot be read from one or both nodes. If this is the only failure reported by the icw_hardware command, restarting the ICW should allow it to pass.
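
The consistency checks described above amount to comparing the Summary value of each FRU across the two nodes. The sketch below shows that comparison on saved FRU Status lines. It is an illustration under stated assumptions, not the tool's actual implementation: the helper names `summaries` and `inconsistent_frus` are hypothetical, and the Node B row showing an Empty slot is invented for the example.

```python
def summaries(lines):
    """Map FRU name -> Summary value for one node's FRU Status table."""
    table = {}
    for line in lines:
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:
            table[parts[1]] = parts[0]
    return table


def inconsistent_frus(node_a_lines, node_b_lines):
    """Return the FRU names whose Summary value differs between the nodes."""
    a, b = summaries(node_a_lines), summaries(node_b_lines)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


node_a = ["OK | I/O Module 0 | 0x8b81", "OK | I/O Module 1 | 0x8b81"]
# Hypothetical peer output: slot 1 is empty on Node B only (sensor value made up).
node_b = ["OK | I/O Module 0 | 0x8b81", "Empty | I/O Module 1 | 0x0000"]
print(inconsistent_frus(node_a, node_b))
```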

Node A Fault Status Register Status = Success
OK | Node
OK | Embedded Module
OK | Internal Backup Battery Module
Module
        OK | DIMM00
        OK | DIMM01
        OK | DIMM02
        OK | DIMM03
        OK | DIMM04
        OK | DIMM05
        OK | DIMM06
        OK | DIMM07
        OK | DIMM08
        OK | DIMM09
        OK | DIMM10
        OK | DIMM11
        OK | DIMM12
        OK | DIMM13
        OK | DIMM14
        OK | DIMM15
        OK | DIMM16
        OK | DIMM17
        OK | DIMM18
        OK | DIMM19
        OK | DIMM20
        OK | DIMM21
        OK | DIMM22
        OK | DIMM23
OK | I/O Module 0
OK | I/O Module 1
OK | 4-Port Card

Node B Fault Status Register Status = Success
OK | Node
OK | Embedded Module
OK | Internal Backup Battery Module
Module
        OK | DIMM00
        OK | DIMM01
        OK | DIMM02
        OK | DIMM03
        OK | DIMM04
        OK | DIMM05
        OK | DIMM06
        OK | DIMM07
        OK | DIMM08
        OK | DIMM09
        OK | DIMM10
        OK | DIMM11
        OK | DIMM12
        OK | DIMM13
        OK | DIMM14
        OK | DIMM15
        OK | DIMM16
        OK | DIMM17
        OK | DIMM18
        OK | DIMM19
        OK | DIMM20
        OK | DIMM21
        OK | DIMM22
        OK | DIMM23
OK | I/O Module 0
OK | I/O Module 1
OK | 4-Port Card

The status values in the left-hand column will be OK or FLT. These are read from the Fault Status Register (FSR).

A "FLT" indicates that the FRU has taken a hardware error.

An "OK" means that there is no hardware error recorded for that FRU. If a FRU is not present, the status in this table should be "OK". (An empty I/O Module slot would be shown as "OK" in these tables but would be listed as "Empty" in the FRU Status table above.)

Recommended Action for failure: Search for related knowledgebase articles for resolution to hardware issues. These include: SLN317238/SLN320677 (Nodes), SLN317213 (Internal Backup Battery Module), SLN317221 (I/O Modules, 4-Port Card), and SLN320676 (Embedded Module).
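
With 24 DIMM rows per node, a single "FLT" entry is easy to miss by eye. The sketch below groups FLT entries by node from saved output. The function name `find_faulted_frus` is hypothetical, the sample header lines are trimmed because this article does not show what a failing header looks like, and the FLT row is invented for illustration.

```python
import re


def find_faulted_frus(output):
    """Return {node: [FRU names reported as FLT]} from the FSR tables."""
    faults = {}
    node = None
    for raw in output.splitlines():
        line = raw.strip()
        header = re.match(r"Node ([AB]) Fault Status Register", line)
        if header:
            node = header.group(1)
            faults[node] = []
            continue
        if node and "|" in line:
            status, fru = [p.strip() for p in line.split("|", 1)]
            if status == "FLT":
                faults[node].append(fru)
    return faults


# Headers are trimmed and the FLT row is made up for illustration.
fsr_sample = """\
Node A Fault Status Register
OK | Node
        FLT | DIMM03
OK | I/O Module 0
Node B Fault Status Register
OK | Node
        OK | DIMM03
"""
print(find_faulted_frus(fsr_sample))
```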

NVRAM Cache Drives
Node Core Counts (NodeA:12, NodeB:12)
Number of NVRAM Drives Required based on Core Count: 2
NVRAM Drives Found (NodeA:2, NodeB:2)
NVMe Storage Drives
Number of NVMe Drives Required: 6
SCM Drives Found (NodeA: 0, NodeB: 0)
SSD Drives Found (NodeA: 12 (NVMe 6, SAS 6), NodeB: 12 (NVMe 6, SAS 6))
NVMe Drive Check = Success
compareNodeDrives - NVEe Drive Counts, NodeA 8, NodeB 8
compareNodeDrives - Both Nodes see same NVMe drives
compareNodeDrives - SAS SSD Drive Counts, NodeA 12, NodeB 12
compareNodeDrives - Both Nodes see same drives
Compare Node Drive Check = Success
checkExpansionEnclosures - nodeAEnclCount 2, nodeBEnclCount 2
Enclosure Check = Success

Drive-related checks include:

  1. The appliance must contain the correct number of NVRAM Cache drives (the specific number depends on the model of the appliance). Recommended Action for failure: Look for missing, faulted, or improperly seated NVRAM drives. The output of the "svc_diag list --nvme_drive" command may be helpful.
  2. The data drives in the system must follow the official configuration rules for SCM, SSD, and SAS drives (in this example, there are no SCM drives in the appliance). Recommended Action for failure: Check the drive labels of all of the NVMe and/or SAS drives. If there is a mixture of SCM and SSD drive types, replace or remove drives as necessary.
  3. The same number of drives must be visible from both nodes (a drive which is visible from only one node will cause problems). Recommended Action for failure: Use "svc_diag list --nvme_drive" to display detailed status about NVMe drives and identify which drive or drives are visible on only one node.
  4. The same number of drive enclosures must be visible from both nodes (an enclosure which is visible from only one node will cause problems). Recommended Action for failure: Check all enclosure cables and verify that the enclosures are properly cabled.
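
Checks 1, 3, and 4 above are count comparisons, so they can be re-derived from saved output. The sketch below does this with regular expressions written against the sample lines shown in this article. The function name `check_drive_counts` and the message wording are hypothetical, the patterns may not match other software versions, and the SCM/SSD mixing rules in check 2 are not covered.

```python
import re


def check_drive_counts(output):
    """Return a list of human-readable problems found in the drive section."""
    issues = []

    required = re.search(r"NVRAM Drives Required based on Core Count: (\d+)", output)
    found = re.search(r"NVRAM Drives Found \(NodeA:\s*(\d+), NodeB:\s*(\d+)\)", output)
    if required and found:
        need = int(required.group(1))
        for node, count in zip("AB", found.groups()):
            if int(count) != need:
                issues.append(f"Node {node}: {count} NVRAM drives found, {need} required")

    pattern = r"compareNodeDrives - (.+?) Drive Counts, NodeA (\d+), NodeB (\d+)"
    for label, a, b in re.findall(pattern, output):
        if a != b:
            issues.append(f"{label} drive count differs: NodeA {a}, NodeB {b}")

    encl = re.search(r"nodeAEnclCount (\d+), nodeBEnclCount (\d+)", output)
    if encl and encl.group(1) != encl.group(2):
        issues.append(f"Enclosure count differs: NodeA {encl.group(1)}, NodeB {encl.group(2)}")
    return issues


drive_sample = """\
Number of NVRAM Drives Required based on Core Count: 2
NVRAM Drives Found (NodeA:2, NodeB:2)
compareNodeDrives - SAS SSD Drive Counts, NodeA 12, NodeB 12
checkExpansionEnclosures - nodeAEnclCount 2, nodeBEnclCount 2
"""
print(check_drive_counts(drive_sample))
# Hypothetical failure: Node B sees one SAS SSD fewer than Node A.
print(check_drive_counts(drive_sample.replace("NodeB 12", "NodeB 11")))
```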

checkIoms - nodeAIoms: [u' 303-321-000C', u' 313-202-000B']
checkIoms - nodeBIoms: [u' 303-321-000C', u' 313-202-000B']
Compare Node IOM Check = Success

The I/O Module in each slot on one node must match the I/O Module in the same slot on the peer node.

This check can fail if the wrong kind of I/O Module is present in one node, or if each node contains one I/O Module but they are in different slots (example: slot 0 on one node, but slot 1 on the peer node). It can also fail if an I/O Module is missing or powered off (see "Fault Status Register" section above).

Recommended Action for failure: Compare the part numbers of the I/O Modules in both I/O Module slots on both nodes. If there are any inconsistencies, move or replace I/O Modules as needed to correct the problem. KB SLN317221 may also be helpful.
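
The part-number comparison can be sketched as follows, using the two "checkIoms" lines from the sample output. This assumes that the position of each part number in the list corresponds to the slot number, which this article does not state. The helper names `parse_ioms` and `mismatched_slots` are hypothetical.

```python
import re


def parse_ioms(line):
    """Extract the quoted part numbers from a checkIoms line, in order."""
    return re.findall(r"'\s*([^']+?)\s*'", line)


def mismatched_slots(node_a_line, node_b_line):
    """Return (slot, node_a_part, node_b_part) for every slot that differs."""
    a, b = parse_ioms(node_a_line), parse_ioms(node_b_line)
    result = []
    for slot in range(max(len(a), len(b))):
        part_a = a[slot] if slot < len(a) else None
        part_b = b[slot] if slot < len(b) else None
        if part_a != part_b:
            result.append((slot, part_a, part_b))
    return result


line_a = "checkIoms - nodeAIoms: [u' 303-321-000C', u' 313-202-000B']"
line_b = "checkIoms - nodeBIoms: [u' 303-321-000C', u' 313-202-000B']"
print(mismatched_slots(line_a, line_b))

# Hypothetical failure: the same two modules, but swapped between slots on Node B.
swapped_b = "checkIoms - nodeBIoms: [u' 313-202-000B', u' 303-321-000C']"
print(mismatched_slots(line_a, swapped_b))
```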

OVERALL STATUS: True, return_code 0
IOM Consistency Check : Success
Node Consistency Check : Success
Battery Check : OK
Fault Status Register A : Success
Fault Status Register B : Success
Node A Accessible : True
Node B Accessible : True
Drive Check : Success
Node Drives Compare Check : Success
Enclosure Check : Success
IO Module Compare Check : Success

This section is a summary of the information provided above.
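
When the output has been saved to a file, the summary block is the quickest place to see which checks failed. The sketch below treats "Success", "OK", and "True" as passing values, since those are the only values shown in this article, and reports anything else. The function name `failed_checks` is hypothetical, and "Failed" in the second example is a placeholder, not a value confirmed by this article.

```python
PASSING = {"Success", "OK", "True"}  # the only passing values shown in this article


def failed_checks(summary_text):
    """Return {check name: value} for every summary line that is not passing."""
    failed = {}
    for line in summary_text.splitlines():
        if line.startswith("OVERALL STATUS"):
            continue  # the overall line has a different layout
        name, sep, value = line.partition(":")
        if sep and value.strip() not in PASSING:
            failed[name.strip()] = value.strip()
    return failed


summary_sample = """\
OVERALL STATUS: True, return_code 0
IOM Consistency Check : Success
Battery Check : OK
Node A Accessible : True
Enclosure Check : Success
"""
print(failed_checks(summary_sample))
# "Failed" is a placeholder; the real failure text is not shown in this article.
print(failed_checks(summary_sample.replace("Enclosure Check : Success", "Enclosure Check : Failed")))
```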


The following is an example of successful output from "svc_diag list --network":

Sample Output

***** Start minimal cabling check *****
OCP_MEZZ 0 is LINK_STATUS_UP on Node A
OCP_MEZZ 0 is LINK_STATUS_UP on Node B
OCP_MEZZ 1 is LINK_STATUS_UP on Node A
OCP_MEZZ 1 is LINK_STATUS_UP on Node B
***** Minimal cabling check: Overall errors: 0
Overall errors: 0, return code: 0
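
The cabling check reports one line per port and node, so a saved copy of the output can be filtered for ports that are not up. The sketch below assumes that every port line follows the pattern shown in the sample. The function name `ports_not_up` is hypothetical, and "LINK_STATUS_DOWN" is an assumed value used only for illustration, since this article shows only "LINK_STATUS_UP".

```python
import re

# Pattern written against the sample lines above, e.g.
# "OCP_MEZZ 0 is LINK_STATUS_UP on Node A".
LINK_LINE = re.compile(r"^(\S+ \d+) is (LINK_STATUS_\w+) on Node ([AB])$")


def ports_not_up(output):
    """Return (node, port, status) for every port whose link is not up."""
    not_up = []
    for line in output.splitlines():
        match = LINK_LINE.match(line.strip())
        if match and match.group(2) != "LINK_STATUS_UP":
            port, status, node = match.groups()
            not_up.append((node, port, status))
    return not_up


# The LINK_STATUS_DOWN line is hypothetical.
network_sample = """\
***** Start minimal cabling check *****
OCP_MEZZ 0 is LINK_STATUS_UP on Node A
OCP_MEZZ 0 is LINK_STATUS_UP on Node B
OCP_MEZZ 1 is LINK_STATUS_UP on Node A
OCP_MEZZ 1 is LINK_STATUS_DOWN on Node B
"""
print(ports_not_up(network_sample))
```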




 


 

Affected Products

PowerStore
Article Properties
Article Number: 000139935
Article Type: Solution
Last Modified: 11 Aug 2021
Version:  5