Unsolved

October 31st, 2013 05:00

Centera connection timeout


Hello,

We have one Centera cluster with 4 nodes. Two of them have the Access role (c001n01 and c001n02). The other day we had a disk replaced on c001n01. After the replacement, once the node came back online, it was not accessible for some time (maybe it was still configuring the new disk?). Our application had problems accessing the cluster (the connection string contains the IPs of both access nodes), and we saw very long timeouts.

To work around the problem, I had to add a firewall rule (Linux iptables) on our application server that rejects any communication with node c001n01. With this rule in place, our application could access c001n02 without any problem.

Is this normal behavior? In the SDK I can specify a connection timeout, but I can only configure it after the pool is open (it is not a global option).
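For anyone hitting the same symptom, a quick way to see which access node is slow to answer is to try a plain TCP connection to each one with a short timeout. The sketch below is only an illustration using Python's standard library, not the Centera SDK. The IP addresses are placeholders, and port 3218 is an assumption about the access-node port that should be checked against your own cluster. A successful TCP connect also does not prove the node is ready to serve requests.

```python
import socket


def node_reachable(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Placeholder addresses for the two access nodes (hypothetical values).
    access_nodes = ["10.0.0.1", "10.0.0.2"]
    # Assumed access port; verify it for your installation.
    port = 3218
    for host in access_nodes:
        state = "reachable" if node_reachable(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

If one node consistently fails or takes the full timeout while the other answers immediately, that points at the node rather than at the application.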

December 13th, 2013 03:00

Even when you have two access nodes, one will be primary; in this case c001n01 could have been the primary. You can use the Centera Viewer to check whether any regens are running in the background. On the application side, you can check the configuration settings for the primary and secondary nodes. If everything is fine, then there is no issue.

December 13th, 2013 06:00

Hello Nacho -

The timeout is normal, but can be adjusted outside the code by setting the FP_OPTION_TIMEOUT environment variable - check the SDK guide for an explanation.
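As a rough illustration of setting the variable outside the application code, the sketch below launches a child process with FP_OPTION_TIMEOUT added to its environment. Only the variable name comes from the post above. The helper names are made up for this example, and the value 30000 assumes the timeout is expressed in milliseconds, which should be confirmed in the SDK guide. The child here is a stand-in that just prints the variable; in practice it would be the command that starts your Centera application.

```python
import os
import subprocess
import sys


def env_with_timeout(timeout_ms, base_env=None):
    """Return a copy of the environment with FP_OPTION_TIMEOUT set.

    timeout_ms is assumed to be in milliseconds (check the SDK guide).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["FP_OPTION_TIMEOUT"] = str(int(timeout_ms))
    return env


def run_with_timeout(cmd, timeout_ms):
    """Run cmd as a child process that inherits the adjusted environment."""
    return subprocess.run(
        cmd,
        env=env_with_timeout(timeout_ms),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Stand-in for the real application: echo the variable it was given.
    child = [sys.executable, "-c",
             "import os; print(os.environ['FP_OPTION_TIMEOUT'])"]
    print(run_with_timeout(child, 30000).stdout.strip())
```

The same effect can be had by exporting the variable in the shell or service script that starts the application; the point is that it must be set before the SDK is loaded.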

It sounds like you have knowledge or ownership of this code, so I will mention that this sounds like an application design problem to me. A well-behaved Centera integration calls FPPool_Open() only once and then uses that pool connection instance throughout the lifecycle of the application. An application written in this manner would still face a 'node down' connection penalty, but only once, during application startup. After the initial 2-minute delay the application would function mostly normally (there would still be short periodic interruptions while the SDK probed to see whether the connection-string node that was not found during application startup had returned to service).
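The open-once pattern described above could be sketched roughly as follows. This is not Centera SDK code: `PoolHolder` is a made-up helper, and `open_pool` stands in for whatever call in your binding wraps FPPool_Open(). The connection string is a placeholder too. The idea is simply that the expensive open happens on first use and every later caller gets the same pool object.

```python
import threading


class PoolHolder:
    """Open a pool lazily, exactly once, and hand the same instance to all callers."""

    def __init__(self, open_pool, connection_string):
        # open_pool is a hypothetical callable wrapping the SDK's pool-open call.
        self._open_pool = open_pool
        self._connection_string = connection_string
        self._pool = None
        self._lock = threading.Lock()

    def get(self):
        # The lock keeps concurrent first calls from opening the pool twice.
        with self._lock:
            if self._pool is None:
                self._pool = self._open_pool(self._connection_string)
            return self._pool


if __name__ == "__main__":
    opened = []

    def fake_open(connection_string):
        # Stand-in for the real open call; records each invocation.
        opened.append(connection_string)
        return object()

    holder = PoolHolder(fake_open, "10.0.0.1,10.0.0.2")  # placeholder addresses
    first = holder.get()
    second = holder.get()
    print(first is second, len(opened))
```

With this shape, a node that is down costs one slow open at startup instead of one slow open per request.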


Best of Luck,

Mike Horgan
