Detection of Management Unit Failure in a Stack + Non-standard Scenarios
Hello,
I have a couple of questions about two N1124T-ON switches in a stack, connected via SFP. I searched the manual but couldn't find a definitive answer.
How does a stack member determine that the management unit is truly dead? Specifically, I'm concerned about scenarios where the SFP stacking links are lost, through physical disconnection or damage, while both switches remain connected to the network. In that situation, if the member switch takes over the master role, including the stack IP address, there will be two devices with the same address on the network. STP would intervene, but that doesn't directly address the question.
Additionally, if the main switch is overloaded and its ports start cycling, hello messages may still flow, telling the other stack members that the main switch is operational even though it is actually in a faulty state. I understand there are safeguards in place, but there seems to be none against a faulty switch.
I understand that switch configurations synchronize from the master. Suppose the switches disconnect, the secondary becomes the primary, config changes are made on it, and then the original master reconnects. Won't the config changes made during the failover be lost?
In general, do you have experience with other, less probable, "catastrophic" scenarios?
Thank you.
DELL-Charles R
Moderator
April 8th, 2024 17:32
Hello,
These are two different design decisions, and you will have to pick one or the other, because the choice affects how you set up the devices that connect to the switches.
--If you have a stack, you can have port-channels connecting to the end devices where one link is plugged into the master and the other into the stack member. That ensures redundancy at both the link level and the switch level. The stack is considered one logical entity with more ports.
--If you have two separate switches (connected to each other with 1Gb Ethernet), you cannot have a port-channel to an end device where one link is plugged into one switch and the other link into the other switch. So you lose the redundancy and high-availability benefits and the ability to manage the switches as one. A port-channel has to go to only one switch.
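To make the rule above concrete, here is a minimal Python sketch of the check it implies: a port-channel is acceptable only if all of its member links terminate on the same logical switch. This is an illustration only, not Dell software. The function name, the switch names, and the port labels are all hypothetical.

```python
def port_channel_is_valid(member_links, logical_switch_of):
    """Return True if every link of the port-channel lands on one logical switch.

    member_links: list of (physical_switch, port) tuples (hypothetical names).
    logical_switch_of: maps each physical switch to the logical entity it
    belongs to; stacked units share one logical id, standalone units do not.
    """
    logical_ids = {logical_switch_of[switch] for switch, _port in member_links}
    return len(logical_ids) == 1


# Two units in one stack: both links terminate on the same logical switch.
stacked = {"unit1": "stack-A", "unit2": "stack-A"}
print(port_channel_is_valid([("unit1", "Gi1/0/1"), ("unit2", "Gi2/0/1")], stacked))

# Two standalone switches: the links terminate on different logical switches.
standalone = {"sw1": "sw1", "sw2": "sw2"}
print(port_channel_is_valid([("sw1", "Gi1/0/1"), ("sw2", "Gi1/0/1")], standalone))
```

The first call passes because the stack counts as one switch; the second fails because each standalone switch is its own logical entity.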
DELL-Charles R
Moderator
March 8th, 2024 21:45
Hello,
The scenarios described above are covered by the different redundancy features implemented in the Dell N-series software. There are different mechanisms for monitoring whether a component is down or has failed; heartbeat/hello messages are one common mechanism, as you mentioned. If the user has ensured that all components are redundant, we are not aware of any remaining scenarios. Here are some of the redundancy features:
TruemanHTS
1 Rookie
April 8th, 2024 14:59
Thanks for the response, I'd like to clarify something further.
For simplicity, let's say someone physically disconnects the SFPs that connect the switches into a stack. At that moment the stack member no longer sees the master, so it becomes the master itself. Nothing changes for the original master. The network therefore contains two "identical devices", and the heartbeat has no way to pass between them. Could a solution be to also connect the switches with a 1 Gb Ethernet link so they can continue to communicate?
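The situation I mean can be sketched as a small Python simulation. This is only a toy model of a generic hello-timeout scheme, not the actual Dell stacking protocol. The timeout value, the unit names, and the address (taken from the documentation range) are assumptions made for illustration.

```python
from dataclasses import dataclass

HELLO_TIMEOUT = 3        # assumed: missed hello intervals before the master is presumed dead
STACK_IP = "192.0.2.10"  # documentation address standing in for the stack management IP


@dataclass
class Unit:
    name: str
    role: str            # "master" or "member"
    missed_hellos: int = 0

    def tick(self, hello_received: bool) -> None:
        """Advance one hello interval; a member promotes itself after a timeout."""
        if self.role == "master":
            return
        self.missed_hellos = 0 if hello_received else self.missed_hellos + 1
        if self.missed_hellos >= HELLO_TIMEOUT:
            self.role = "master"

    def owned_ips(self) -> list:
        # Only a unit that believes it is the master answers for the stack IP.
        return [STACK_IP] if self.role == "master" else []


def simulate(stack_link_up_per_tick: list) -> list:
    """Return the names of the units claiming STACK_IP after the given ticks."""
    master = Unit("unit1", "master")
    member = Unit("unit2", "member")
    for link_up in stack_link_up_per_tick:
        # In this model hellos travel only over the stacking link.
        member.tick(hello_received=link_up)
    return [u.name for u in (master, member) if STACK_IP in u.owned_ips()]


print(simulate([True, True, True, True]))     # stacking link healthy
print(simulate([True, False, False, False]))  # stacking cables pulled
```

In the second run both units end up claiming the stack IP, because the member cannot tell a dead master apart from a dead stacking link. That is why I am asking about a second path for the hellos.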