During initial installation of an appliance into a new cluster, or when adding a new appliance to an existing cluster, hardware and network checks are run against the appliance. If any of these checks fail, the status of the appliance may change from "Unconfigured" to "Unconfigured Faulted". In this state the appliance cannot join a new cluster or be added to an existing cluster. The fault must be cleared before the appliance can be added to a cluster.
To determine whether a hardware or network issue is causing the "Unconfigured Faulted" condition, perform the following steps:
If you are adding the appliance to an existing Cluster (from the PowerStore Manager):
svc_diag list --icw_hardware
svc_diag list --network
svc_dc run
The following is an example of successful output from "svc_diag list --icw_hardware" in the left-hand column; the right-hand column explains the errors you may see.
Sample Output | Description |
hw_type Warnado-EX | |
Node A FRU Status OK | Peer Node | 0x0f80 Node B FRU Status |
These tables consist of three columns:
Summary | FRU Name | Status Sensor value
The Summary column on the left should be read as follows:
OK = FRU status is good
Empty = FRU is missing and/or not detected by the appliance. Since I/O Modules are optional, it can be normal to see "Empty" status for I/O Modules. (In that case, the same I/O Module slot(s) must be Empty on both nodes). All other FRUs are required hardware and should always be "OK".
Off = FRU is powered off. FRU may need to be replaced.
Unknown = Status Sensor value contains unexpected values. FRU may need to be replaced.
Recommended Action for failure: Consult related KB articles for details on how to resolve these hardware issues. These include: SLN317238/SLN320677 (Nodes), SLN317221 (I/O Modules, 4-Port Card), and SLN320676 (Embedded Module). |
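The Summary rules above can be condensed into a small lookup. The following is a minimal sketch in Python and is not part of the PowerStore tooling: the function name `fru_needs_attention` and the `is_io_module` flag are hypothetical, and it assumes the Summary value has already been read from the table as a plain string.

```python
def fru_needs_attention(summary: str, is_io_module: bool = False) -> bool:
    """Return True if a FRU Summary value calls for follow-up.

    Hypothetical helper: it encodes only the rules described in the text.
    """
    value = summary.strip()
    if value == "OK":
        return False
    if value == "Empty":
        # Empty is acceptable only for optional I/O Modules; the peer node
        # must then show the same slot as Empty (checked separately).
        return not is_io_module
    # "Off", "Unknown", or any unexpected value: the FRU may need replacing.
    return True
```

For example, `fru_needs_attention("Empty", is_io_module=True)` returns `False`, while the same value for any required FRU returns `True`.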
IO Module Consistency Check = Success Node Consistency Check = Success Battery Check = OK |
These checks compare the FRU Status Summary values from each node. Both nodes are expected to report the same Summary value for each FRU. Recommended Action for failure:
|
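The comparison these consistency checks perform can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the appliance's actual logic: the function name `mismatched_frus` and the FRU names in the sample dictionaries are hypothetical, and each node's Summary values are assumed to have been collected into a dictionary by hand.

```python
def mismatched_frus(node_a: dict, node_b: dict) -> list:
    """List FRU names whose Summary value differs between the two nodes."""
    names = sorted(set(node_a) | set(node_b))
    # A FRU reported by only one node is treated as a mismatch.
    return [name for name in names if node_a.get(name) != node_b.get(name)]


# Hypothetical Summary values; the FRU names are for illustration only.
node_a = {"IoModule0": "OK", "IoModule1": "Empty", "Battery": "OK"}
node_b = {"IoModule0": "OK", "IoModule1": "OK", "Battery": "OK"}
print(mismatched_frus(node_a, node_b))  # prints ['IoModule1']
```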
Node A Fault Status Register Status = Success Node B Fault Status Register Status = Success |
The status values in the left-hand column will be OK or FLT. These are read from the Fault Status Register (FSR).
A status of "FLT" indicates that a hardware error has been recorded for that FRU.
An "OK" means that there is no hardware error recorded for that FRU. If a FRU is not present, the status in this table should be "OK". (An empty I/O Module slot would be shown as "OK" in these tables but would be listed as "Empty" in the FRU Status table above.)
Recommended Action for failure: Search for related knowledgebase articles for resolution to hardware issues. These include: SLN317238/SLN320677 (Nodes), SLN317213 (Internal Backup Battery Module), SLN317221 (I/O Modules, 4-Port Card), and SLN320676 (Embedded Module). |
NVRAM Cache Drives |
Drive-related checks include: |
checkIoms - nodeAIoms: [u' 303-321-000C', u' 313-202-000B'] |
The I/O Module in each slot on one node must match the I/O Module in the same slot on the peer node.
This check can fail if the wrong kind of I/O Module is present in one node, or if each node contains one I/O Module but they are in different slots (example: slot 0 on one node, but slot 1 on the peer node). It can also fail if an I/O Module is missing or powered off (see "Fault Status Register" section above).
Recommended Action for failure: Compare the part numbers of the I/O Modules in both I/O Module slots on both nodes. If there are any inconsistencies, move or replace I/O Modules as needed to correct the problem. KB SLN317221 may also be helpful. |
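When comparing part numbers by hand, it can help to pull them out of the checkIoms lines and line them up slot by slot. The Python sketch below is one way to do that under stated assumptions: the helper names are hypothetical, the position in each list is assumed to correspond to the slot number, and the peer-node line (`nodeBIoms`) is invented for illustration because the sample output shows only the Node A list.

```python
import re


def parse_part_numbers(line: str) -> list:
    """Extract the quoted part numbers from a checkIoms-style line."""
    return [part.strip() for part in re.findall(r"'([^']*)'", line)]


def mismatched_slots(node_a_parts: list, node_b_parts: list) -> list:
    """Return the slot indexes whose part numbers differ between nodes."""
    slots = max(len(node_a_parts), len(node_b_parts))
    # Pad the shorter list so a missing entry counts as a mismatch.
    a = node_a_parts + [None] * (slots - len(node_a_parts))
    b = node_b_parts + [None] * (slots - len(node_b_parts))
    return [i for i in range(slots) if a[i] != b[i]]


line_a = "checkIoms - nodeAIoms: [u' 303-321-000C', u' 313-202-000B']"
# Hypothetical peer-node line, invented for this example.
line_b = "checkIoms - nodeBIoms: [u' 303-321-000C', u' 303-321-000C']"

print(mismatched_slots(parse_part_numbers(line_a), parse_part_numbers(line_b)))
# prints [1]
```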
OVERALL STATUS: True, return_code 0 |
This section is a summary of the information provided above. |
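If the output is saved to a file for review, the summary line can be picked out programmatically. The following Python sketch is only an illustration: the function name `parse_overall_status` is hypothetical, and it assumes that a failing run reports "False" and a non-zero return code in the same layout, which the sample output above does not show.

```python
import re


def parse_overall_status(text: str):
    """Return (status, return_code) from an 'OVERALL STATUS' line, or None."""
    match = re.search(
        r"OVERALL STATUS:\s*(True|False),\s*return_code\s+(-?\d+)", text
    )
    if match is None:
        return None
    return match.group(1) == "True", int(match.group(2))


print(parse_overall_status("OVERALL STATUS: True, return_code 0"))
# prints (True, 0)
```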
The following is an example of successful output from "svc_diag list --network":
Sample Output |
***** Start minimal cabling check *****
OCP_MEZZ 0 is LINK_STATUS_UP on Node A
OCP_MEZZ 0 is LINK_STATUS_UP on Node B
OCP_MEZZ 1 is LINK_STATUS_UP on Node A
OCP_MEZZ 1 is LINK_STATUS_UP on Node B
***** Minimal cabling check: Overall errors: 0
Overall errors: 0, return code: 0 |
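The cabling output above can be scanned for ports that are not up. The Python sketch below is a hedged illustration, not a supported tool: the function name `links_not_up` is hypothetical, and because the sample shows only `LINK_STATUS_UP`, the code simply assumes that any other status value indicates a link that needs attention.

```python
import re

# Matches lines such as "OCP_MEZZ 0 is LINK_STATUS_UP on Node A".
LINK_RE = re.compile(r"(OCP_MEZZ \d+) is (\w+) on Node ([AB])")


def links_not_up(output: str) -> list:
    """Return (port, node, status) for every link not reported as up."""
    return [
        (port, node, status)
        for port, status, node in LINK_RE.findall(output)
        if status != "LINK_STATUS_UP"
    ]


sample = """OCP_MEZZ 0 is LINK_STATUS_UP on Node A
OCP_MEZZ 0 is LINK_STATUS_UP on Node B
OCP_MEZZ 1 is LINK_STATUS_UP on Node A
OCP_MEZZ 1 is LINK_STATUS_UP on Node B"""

print(links_not_up(sample))  # prints []
```

An empty list corresponds to the successful case shown in the sample output.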