Kecheng

19 Posts

3620

August 18th, 2013 19:00

How long exactly will PP take to redirect failed IOs, and why?

Hi guys,

While supporting a customer recently, I hit an issue that confuses me a lot:

1. A Linux server with an HBA that has 2 x 8G FC ports is zoned to two FAs of a Symmetrix array;

2. Shut down the FC switch port that one of the HBA's FC ports is attached to while heavy IO traffic is flowing;

3. An IO pause is triggered; during this period, no traffic can be delivered;

4. After a couple of seconds, the IO traffic gets redirected to the remaining active paths.
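The pause in step 3 can be measured rather than estimated. Below is a minimal sketch, not a PowerPath tool: `probe_io` and `max_gap` are hypothetical helper names, and the device argument is a placeholder for whatever device carries the traffic (for example a PowerPath pseudo device). It assumes GNU `dd` (for `iflag=direct`) and a `date` that supports `+%s`, so the resolution is one second.

```shell
#!/bin/sh
# Print the largest gap, in seconds, between consecutive timestamps on stdin
# (one epoch-seconds value per line, one line per completed probe IO).
max_gap() {
    awk 'NR > 1 { gap = $1 - prev; if (gap > max) max = gap }
         { prev = $1 }
         END { print max + 0 }'
}

# Hypothetical probe loop: issue one small direct read per second and log the
# completion time of each successful read. A read that blocks during the
# failover shows up as a large gap between two logged timestamps.
probe_io() {
    device=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        dd if="$device" of=/dev/null bs=4k count=1 iflag=direct 2>/dev/null \
            && date +%s
        sleep 1
        i=$((i + 1))
    done
}

# Example (placeholders, adapt to the host under test):
#   probe_io /dev/emcpowera 120 > io.log    # disable the switch port meanwhile
#   max_gap < io.log
```

Because each probe is followed by a one-second sleep, a normal run shows gaps of about one second, and the failover pause is roughly the largest gap minus that second.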

After some investigation, some experts told me that it normally takes 30 to 45 seconds in this situation before IO resumes. The reason is that PP needs to wait for the IO failure to be returned before it redirects IO, and the SCSI layer and HBA driver have their own retry mechanisms, which take time.

After some more investigation, it is clear where the 45 comes from:

# cat /sys/class/fc_remote_ports/rport-4:0-1/dev_loss_tmo

45

dev_loss_tmo - Specify the number of seconds the scsi layer will wait after a problem has been detected on a FC remote port before removing it from the system. It will be automatically adjusted to the overall retry interval no_path_retry * polling_interval if a number of retries is given with no_path_retry and the overall retry interval is longer than the specified dev_loss_tmo value.
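As a minimal sketch of the two points above, the script below lists the timeout attributes of every FC remote port and applies the adjustment rule from the quoted description. The function names `list_rport_timeouts` and `effective_dev_loss_tmo` and the `SYSFS_ROOT` override are hypothetical, introduced only for illustration. `fast_io_fail_tmo` and `port_state` are other sysfs attributes of FC remote ports that the post does not mention; they are printed as `n/a` when absent. The adjustment only applies where `no_path_retry` and `polling_interval` are actually configured, which is an assumption about the setup.

```shell
#!/bin/sh
# List dev_loss_tmo (plus fast_io_fail_tmo and port_state when present)
# for every FC remote port. SYSFS_ROOT is a hypothetical override for testing.
list_rport_timeouts() {
    root="${SYSFS_ROOT:-/sys}/class/fc_remote_ports"
    for rport in "$root"/rport-*; do
        [ -d "$rport" ] || continue      # no FC remote ports on this host
        name=$(basename "$rport")
        dev_loss=$(cat "$rport/dev_loss_tmo" 2>/dev/null || echo "n/a")
        fast_fail=$(cat "$rport/fast_io_fail_tmo" 2>/dev/null || echo "n/a")
        state=$(cat "$rport/port_state" 2>/dev/null || echo "n/a")
        echo "$name dev_loss_tmo=$dev_loss fast_io_fail_tmo=$fast_fail port_state=$state"
    done
}

# Apply the rule from the description: if no_path_retry * polling_interval
# is longer than dev_loss_tmo, the retry interval wins.
# Arguments: dev_loss_tmo no_path_retry polling_interval (all in seconds/counts)
effective_dev_loss_tmo() {
    configured=$1
    retries=$2
    interval=$3
    retry_window=$((retries * interval))
    if [ "$retry_window" -gt "$configured" ]; then
        echo "$retry_window"
    else
        echo "$configured"
    fi
}

list_rport_timeouts
effective_dev_loss_tmo 45 12 5    # prints 60, since 12 * 5 = 60 > 45
```

The values 12 and 5 are illustrative inputs, not settings taken from the customer's host.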

But the questions below still confuse me:

1. Where does the 30 come from? I found that a bus test is performed periodically, every 30 seconds; is that where it comes from?

2. According to the failover white paper, a failed path (and whatever shares its common setup, such as all paths through the same bus) is marked for testing immediately. Does that mean the test is triggered right after the marking, or do we still need to wait for the next periodic path test?

3. Does the path test rely on the SCSI/HBA retry mechanism?

* If so, 30 seconds is definitely not enough, since we would still need to wait for dev_loss_tmo;

* If not, does that mean the test is performed through the other active paths by querying the storage array directly? It really confuses me a lot :(

Waiting for a clear explanation......, thanks:)

Responses(9)

Y1Udba1GsO12085

71 Posts

0

August 22nd, 2013 06:00

Hi KC,

The SCSI inquiry command normally has a much lower timeout value than regular IOs, so the inquiry would time out and fail much faster. Also, PowerPath path tests are scheduled every 10 seconds.

You can find documentation of the specific values in the HBA driver's release notes/user guide.

Javier Soriano

PowerPath Corporate Systems Engineer

christopher_ime

2K Posts

0

August 18th, 2013 23:00

Please consider moving this question as-is (no need to recreate) to the proper forum for maximum visibility. Questions written to the users' own "Discussions" space don't get the same amount of attention and questions can go unanswered for a long time.

You can do so by selecting "Move" under ACTIONS along the upper-right. Then search for and select: "PowerPath" which would be the most relevant for this question.

Kecheng

19 Posts

0

August 18th, 2013 23:00

Thanks for your tip. This is the first time I have used the EMC community :)

Kecheng

19 Posts

0

August 20th, 2013 19:00

Still waiting for input~~~

Butch777

88 Posts

0

August 21st, 2013 04:00

KC - we appreciate your input and are working on a response. Thanks for your patience.

Bob Lonadier

PowerPath Product Manager

Y1Udba1GsO12085

71 Posts

1

August 21st, 2013 07:00

Hi KC,

PowerPath for Linux sits above the SCSI layer (sd driver) and relies on it to determine when an IO has failed. Depending on the type of failure, the IO can be considered failed immediately or take some seconds to fail.

Normally, if an IO is sent to a target and there is no response, the IO has to wait for the SCSI layer to time out. Again, that doesn't mean that all failure scenarios have to wait for the SCSI timeout. This timeout value can vary between different Linux distributions and HBA vendor drivers. EMC-approved HBA drivers have the recommended values to work with EMC storage. These values have been tailored to accommodate several scenarios, such as array failover, firmware upgrades to controllers, and the like.
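On Linux, the per-device SCSI command timeout is exposed in sysfs as `/sys/block/<dev>/device/timeout`, in seconds. The sketch below prints it for every `sd` device so it can be compared with the observed pause. The function name `list_scsi_timeouts` and the `SYSFS_ROOT` override are hypothetical. Whether this is the exact value referred to above, and how it combines with HBA driver retries, is an assumption to verify against the driver documentation.

```shell
#!/bin/sh
# Print the SCSI command timeout (in seconds) of every sd block device.
# SYSFS_ROOT is a hypothetical override that makes the function testable.
list_scsi_timeouts() {
    root="${SYSFS_ROOT:-/sys}/block"
    for dev in "$root"/sd*; do
        # Skip when no sd devices exist or the attribute is not readable.
        [ -r "$dev/device/timeout" ] || continue
        echo "$(basename "$dev") timeout=$(cat "$dev/device/timeout")"
    done
}

list_scsi_timeouts
```

Running it before and after installing a vendor HBA driver package is one way to see whether the package changes the value.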

PowerPath uses SCSI inquiry commands to periodically test alive idle paths and paths that have been marked for testing due to an IO failure. SCSI inquiry normally has a lower timeout than other SCSI commands.

Regards

Kecheng

19 Posts

0

August 21st, 2013 18:00

One thing still confuses me: since PP sits above the SCSI layer and HBA driver, all PP test inquiries go through them too, so won't such inquiries also wait for the SCSI/HBA timeout?

Also, could you please share some docs about "EMC approved HBA drivers have the recommended values to work with EMC storage"? I want to have a thorough understanding of it :)

I really appreciate your help, thanks a ton:)

Kecheng

19 Posts

0

August 21st, 2013 18:00

Hi Robert,

Thanks a lot for your kindness :)

-KC

Kecheng

19 Posts

0

August 22nd, 2013 18:00

This makes sense:)

Thanks very much for your answer.
