April 28th, 2015 21:00

A Quick Troubleshooting Process of VPLEX Performance Issue

Introduction

      EMC VPLEX is a storage virtualization product introduced by EMC in 2010. Customers sometimes report that VPLEX performance has degraded even though there is no hardware failure. In such situations we usually have to ask more detailed questions and collect a lot of information from the customer.


      Many items need to be checked: back-end VNX performance logs, back-end Symmetrix STP performance data, back-end SAN switch logs, front-end SAN switch logs, host logs and performance data (IO response time, IO size, file system type, workload type and read/write ratio, host CPU/memory utilization), the host MPIO software, and the link connectivity between the two Metro sites.


      In most cases, however, the problem is not complicated. This article introduces a simple process for checking a VPLEX performance issue, along with the basic performance monitoring features of VPLEX.


Detailed Information

      First, run the general health check program VPlexPlatformHealthCheck on the management server to make sure that there is no failure on the VPLEX.

The output looks like this:


service@ManagementServer:~> VPlexPlatformHealthCheck

System Information
------------------
     single engine(small config)system detected

Management Server IP Connectivity Check
---------------------------------------
Port Plugged : OK
IP interfaces : OK
IP Connectivity to Directors Check : OK

Local-com FC Connectivity Check
-------------------------------
Director to Director Connectivity Check : OK

Management Server System Check
------------------------------
Process Check : OK
Check Partitions : OK
CPU Check : OK
BMC Check : OK

Director (engine-1-1 director 1A 128.221.252.35) Health Check
-------------------------------------------------------------
    Process Check: OK
    CPU Check: OK
    SSD Check: OK
    Partition Check: OK
    RPM Check: OK
    flashDir Check: OK
    WWN Seed Check: OK
    Health Check: OK
    Hardware Module Check: OK

Director (engine-1-1 director 1B 128.221.252.36) Health Check
-------------------------------------------------------------
    Process Check: OK
    CPU Check: OK
    SSD Check: OK
    Partition Check: OK
    RPM Check: OK
    flashDir Check: OK
    WWN Seed Check: OK
    Health Check: OK
    Hardware Module Check: OK
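
      On a larger system the health check output gets long, so it can help to filter it for anything that is not OK. Below is a minimal Python sketch that does this on saved output. It assumes the layout shown above (a section title followed by a row of dashes, then "name : status" lines) and treats any status other than OK as worth a look. The function name find_failed_checks is our own and is not part of VPLEX.

```python
import re


def find_failed_checks(output: str) -> list[tuple[str, str, str]]:
    """Return (section, check, status) for every check whose status is not "OK".

    Assumes the layout of the sample output: a section title, a row of
    dashes under it, then lines of the form "<check name> : <status>".
    """
    failures = []
    section = "(no section)"
    # Drop blank lines and indentation so the layout is easier to walk.
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    for i, line in enumerate(lines):
        # A section title is the line directly above a row of dashes.
        if i + 1 < len(lines) and set(lines[i + 1]) == {"-"}:
            section = line
            continue
        # Skip the dashed separator itself.
        if set(line) == {"-"}:
            continue
        match = re.match(r"^(.+?)\s*:\s*(\S.*)$", line)
        if match and match.group(2) != "OK":
            failures.append((section, match.group(1), match.group(2)))
    return failures
```

      To use it, save the health check output to a text file, read the file into a string, and pass the string to find_failed_checks. An empty list means every check reported OK.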

      The second step is to check the Monitoring page in the VPLEX web interface. On this page we can check the CPU utilization:


      image003.jpg

     

      Front-end Latency:


      image005.jpg


      Front-end Throughput:


      image007.jpg

      Back-end Latency:

      image009.jpg

      Back-end Throughput:

      image011.jpg

     

      Check whether the front-end and back-end IO latencies are consistent with each other. This helps us quickly identify where the problem is.


      image013.jpg
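
      The comparison above can be sketched as a small decision rule. This is only an illustration of the reasoning, not a VPLEX tool: it assumes you have read the current front-end and back-end latency off the Monitoring page and know roughly what is normal for your system. The function name, the baseline parameters and the factor of 2 are all hypothetical and should be adapted to your environment.

```python
def locate_latency_source(fe_ms: float, be_ms: float,
                          fe_baseline_ms: float, be_baseline_ms: float,
                          factor: float = 2.0) -> str:
    """Suggest where to look first, based on front-end and back-end latency.

    A latency counts as "high" when it exceeds factor times its baseline.
    The baselines and the factor are assumptions supplied by the caller.
    """
    fe_high = fe_ms > factor * fe_baseline_ms
    be_high = be_ms > factor * be_baseline_ms
    if be_high:
        # Back-end latency is elevated: check the arrays and back-end SAN first.
        return "back-end"
    if fe_high:
        # Only front-end latency is elevated: check VPLEX and the front-end SAN.
        return "front-end or VPLEX"
    # Neither latency is elevated from the VPLEX point of view.
    return "no latency anomaly seen by VPLEX"
```

      For example, if the front-end latency is several times its usual value while the back-end latency is also several times its usual value, the rule points to the back-end first, which matches the idea that slow back-end IO shows up as slow front-end IO.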

     

      Check whether the latencies of all front-end and back-end ports are consistent. In a VPLEX Metro deployment, also check the WAN port traffic and latency.


      image015.jpg
      image017.jpg
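
      When there are many ports, a port that is out of line with the others can be picked out by comparing each port with the median of the group. The sketch below assumes you have copied per-port latency values (in milliseconds) from the Monitoring page into a dictionary. The port names in the usage note and the factor of 3 are illustrative assumptions, not values taken from VPLEX.

```python
from statistics import median


def find_outlier_ports(latency_ms: dict[str, float], factor: float = 3.0) -> list[str]:
    """Return the ports whose latency is more than factor times the median.

    latency_ms maps a port name to its latency in milliseconds.
    The median is used so that one bad port does not distort the reference.
    """
    if not latency_ms:
        return []
    reference = median(latency_ms.values())
    return sorted(port for port, value in latency_ms.items()
                  if value > factor * reference)
```

      For instance, calling find_outlier_ports with four back-end ports where three report about 1 ms and one reports about 10 ms returns only the slow port, which is then the first candidate for a cable, SFP or switch port check.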

     


      Here is a real case. Each of the customer's clients took almost three minutes to finish an operation that normally takes only three seconds. The investigation showed that the front-end ports were all normal, but one back-end port had a high packet drop rate. We then found many CRC errors and C3 discards in the switch logs. After the OM3 cable was replaced, the issue was resolved. We solved two other cases with the same method. In conclusion, checking the Monitoring page in the VPLEX web interface and monitoring the traffic and latency of all ports can quickly identify where the problem is.





Translator: Roger




