Cluster not in sync
Hi Guys,
We had an unplanned power outage yesterday during rack maintenance and have been having issues since then.
#Show-Clusters
Cluster-Name  Index  State    Gates-Open  Conn-State
CLU01         1      unknown  True        connected
#Show-Bricks
Brick-Name  Index  Cluster-Name  Index  State
X1          1      CLU01         1      not_in_sys
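For anyone who wants to script this check, here is a minimal sketch that flags the problem states in pasted output like the above. It only parses text; it does not call xmcli. The function names are hypothetical, and the set of "problem" states is an assumption limited to the two values seen in this thread.

```python
# Sketch only: parses whitespace-separated text pasted from the output above.
# PROBLEM_STATES is an assumption: just the two states reported in this thread.
PROBLEM_STATES = {"unknown", "not_in_sys"}


def parse_rows(text):
    """Split pasted output into rows of fields, dropping the header line."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    return rows[1:]  # the first non-blank line is the column header


def flag_problem_rows(text, state_col):
    """Return (name, state) for every row whose state is in PROBLEM_STATES."""
    return [
        (row[0], row[state_col])
        for row in parse_rows(text)
        if row[state_col] in PROBLEM_STATES
    ]


clusters = """Cluster-Name Index State Gates-Open Conn-State
CLU01 1 unknown True connected"""

bricks = """Brick-Name Index Cluster-Name Index State
X1 1 CLU01 1 not_in_sys"""

print(flag_problem_rows(clusters, state_col=2))
print(flag_problem_rows(bricks, state_col=4))
```

The state column index is passed in explicitly because the two listings put State in different positions.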
In the XMS GUI under Inventory, the X-Bricks show as "Not in cluster".
All of the checks run via the controller CLI (xinstall) pass:
1. Check DAE controllers connectivity
2. Check IB switches connectivity
3. Check dedicated IPMI connectivity
4. Check BBU connectivity
5. Check PSU input
Any advice?
DELL-Josh Cr
Moderator
October 5th, 2021 08:00
Hi,
Are you able to reboot the devices and see whether they reconnect? This article may also help: https://dell.to/3B8QLE5
Let us know if you have any further questions.
DELL-Sam L
Moderator
October 5th, 2021 16:00
Hello R.Hari,
You are going to need to open a support case to resolve this issue.
R.Hari
October 5th, 2021 16:00
Thanks. All cluster operations return the error "invalid_in_cur_sys_state", so I tried the emergency shutdown and startup procedure. It did not help.
Storage-Controller-Name  Index  Cluster-Name  Index  Mgr-Addr-Subnet   MGMT-GW-IP
X1-SC1                   1      CLU01         1      1XX.XX.XX.231/27  172.XX.XX.XX
X1-SC2                   2      CLU01         1      1XX.XX.XX.232/27  172.XX.XX.XX
xmcli >
xmcli >
xmcli > test-xms-storage-controller-connectivity sc-id=1
64 bytes from 1XX.XX.XX.231: icmp_seq=1 ttl=64 time=0.092 ms
64 bytes from 1XX.XX.XX.231: icmp_seq=2 ttl=64 time=0.081 ms
64 bytes from 1XX.XX.XX.231: icmp_seq=3 ttl=64 time=0.099 ms
64 bytes from 1XX.XX.XX.231: icmp_seq=4 ttl=64 time=0.108 ms
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
xmcli > test-xms-storage-controller-connectivity sc-id=2
64 bytes from 1XX.XX.XX.232: icmp_seq=1 ttl=64 time=0.114 ms
64 bytes from 1XX.XX.XX.232: icmp_seq=2 ttl=64 time=0.106 ms
64 bytes from 1XX.XX.XX.232: icmp_seq=3 ttl=64 time=0.113 ms
64 bytes from 1XX.XX.XX.232: icmp_seq=4 ttl=64 time=0.108 ms
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
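A minimal sketch for checking the ping summaries above in a script, assuming the summary line has the same shape as in this output. The function name `packet_loss` is hypothetical. Note that 0% loss here only shows the XMS can reach the storage controllers' management addresses; as this thread shows, the cluster state can still be "unknown".

```python
import re

# Matches a summary line such as:
#   4 packets transmitted, 4 received, 0% packet loss, time 3000ms
SUMMARY = re.compile(
    r"(\d+) packets transmitted, (\d+) received, (\d+(?:\.\d+)?)% packet loss"
)


def packet_loss(output):
    """Return (sent, received, loss_percent), or None if no summary line is found."""
    match = SUMMARY.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


sample = """64 bytes from 1XX.XX.XX.231: icmp_seq=4 ttl=64 time=0.108 ms
4 packets transmitted, 4 received, 0% packet loss, time 3000ms"""

result = packet_loss(sample)
if result is not None and result[2] == 0.0:
    print("management network reachable, no packet loss")
```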
R.Hari
October 5th, 2021 17:00
Thanks, will do. One thing I wanted to try was removing and re-adding the cluster in XMS to see if that fixes the cluster status issue.
#Show-Clusters
Cluster-Name  Index  State    Gates-Open  Conn-State
CLU01         1      unknown  True        connected
The remove-cluster and add-cluster operations won't wipe data, right? I read that they won't, but I wanted to be 100% sure.
DELL-Sam L
Moderator
October 6th, 2021 08:00
Hello R.Hari,
As long as you remove the cluster as stated, you will not lose any data.
R.Hari
October 6th, 2021 22:00
The cluster was removed, and now it won't let me add it back. I am getting the error below.
xmcli (admin)> add-cluster sc-mgr-host="1XX.XX.XX.232" force
09:04:11 - Collecting cluster information
*** XMS Completion Code: Method 'sym.SystemQueryAllObjs.__str__' not defined
I also looked up the serial number on the Dell EMC portal. This device is not under support.
It was an ex-production unit that we repurposed for the lab, so I am not sure whether it can be recovered or reused.
DELL-Sam L
Moderator
October 7th, 2021 16:00
Hello R.Hari,
To resolve that error, there are some commands that support has to run. I understand that your system is out of support, but the commands to resolve this are limited to support only.
R-HARi
October 31st, 2021 21:00
Hi Sam,
Thanks and I understand.
But as you know, there are almost no documents or articles out there for systems like XIO. All one can find is the user guide on the Dell EMC portal. While I am sure Dell EMC can't help, I decided to post it here in the community, hoping someone who has previously experienced the issue could advise.
I have spent some time on it over the last two weeks, as it's still the heart of our VMware lab and we wouldn't want to just throw it away. No luck yet. Let me know if anyone has any ideas. Cheers.
gvaidman
March 22nd, 2023 11:00
FYI, we got the same exact error trying to import a previously running XIO cluster into an XMS. The issue turned out to be that not all components (e.g., DAEs, controllers, back-end IB switches) were powered on. Once we got everything plugged in/powered on, the add-cluster command succeeded.
Our cluster had been removed from the XMS and powered down in preparation for decommissioning, but it was then decided to repurpose it. So not the exact use case as above, but similar (with the addition of it being powered off).
Just wanted to leave this here in case someone runs into this same error message.