Symptoms
The Oracle RAC database servers show inordinately high CPU load averages, which results in sessions going into the D state (uninterruptible sleep) and in I/O latency against the database. Because of this, the applications must be switched to the other RAC member nodes and the affected node rebooted. After the node is rebooted and rejoined to the RAC cluster, the issue persists.
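To confirm the D-state symptom on an affected node, the output of the standard Linux command `ps -eo pid,stat,comm` can be filtered for processes whose state begins with D. The following is a minimal sketch, not a supported diagnostic tool; the function name `d_state_processes` and the `SAMPLE` text are hypothetical and used for illustration only.

```python
from typing import List, Tuple


def d_state_processes(ps_output: str) -> List[Tuple[int, str]]:
    """Return (pid, command) for each process whose STAT field starts with 'D'.

    Expects text in the layout produced by `ps -eo pid,stat,comm`:
    a header row followed by one process per line.
    """
    found = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)          # pid, stat, rest of line (command)
        if len(parts) < 3:
            continue                          # ignore blank or malformed lines
        pid, stat, comm = parts
        if stat.startswith("D"):              # D = uninterruptible sleep
            found.append((int(pid), comm.strip()))
    return found


# Hypothetical sample output, for illustration only.
SAMPLE = """\
  PID STAT COMMAND
    1 Ss   systemd
 4211 D    oracle
 4212 Dl   oracle
 4300 R+   ps
"""

for pid, comm in d_state_processes(SAMPLE):
    print(pid, comm)
```

On a live node, the same function could be fed the text captured from `ps -eo pid,stat,comm`. A count of D-state processes that stays high across repeated samples is consistent with stalled I/O rather than CPU-bound work.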
Cause
Per Brocade engineering analysis, the failures were caused by marginal SFPs on the switches that had low power levels. Brocade has addressed this condition in the Brocade FOS 6.10 release, which introduces a port fencing feature that correctly handles ports running at low RX or TX power so that they do not stall I/O.
Resolution
Implement port fencing, which is available in the Brocade FOS 6.10 release. With port fencing, the switch monitors for specific behaviors on a port and protects itself by fencing the port when a given threshold is exceeded.
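The threshold behavior described above can be pictured with a small conceptual model. This is only a sketch of the general idea and is not Brocade's implementation or the FOS command set. The names `Port` and `apply_fencing`, the per-window error count, and the assumption that a fenced port stays fenced until an administrator re-enables it are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Port:
    """Hypothetical model of a switch port, for illustration only."""
    index: int
    fenced: bool = False


def apply_fencing(port: Port, errors_in_window: int, threshold: int) -> bool:
    """Fence the port when the observed error count exceeds the threshold.

    Assumption: once fenced, the port stays fenced regardless of later
    counts, until it is explicitly re-enabled (port.fenced = False).
    Returns the port's fenced state after the check.
    """
    if not port.fenced and errors_in_window > threshold:
        port.fenced = True
    return port.fenced


port = Port(index=3)
print(apply_fencing(port, errors_in_window=5, threshold=10))   # below threshold
print(apply_fencing(port, errors_in_window=11, threshold=10))  # threshold exceeded
print(apply_fencing(port, errors_in_window=0, threshold=10))   # remains fenced
```

The point of the model is that a marginal port is taken out of service as soon as it misbehaves beyond the configured limit, so it cannot keep stalling I/O for hosts that still have healthy paths.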
For specific details, see the Brocade SAN Fabric Administration Best Practices Guide at:
https://docs.broadcom.com/doc/12379730