RecoverPoint & RecoverPoint for Virtual Machines: Consistency Groups Swapping Between RPAs Due to Replication Process Crashes or Reboot Regulation

Symptoms

RecoverPoint and RecoverPoint for Virtual Machines Consistency Groups (CGs) can enter a state in which they repeatedly swap between primary RecoverPoint Appliances (RPAs) because of repeated replication process crashes. If the crashes are frequent enough, the RPAs can enter reboot regulation and detach from the RecoverPoint cluster, and an RPA then shows as down in the GUI.

In the RPA replication logs, the DistributorPhase1 process shows a low-credit status (not enough available memory), and an assertion failure is logged:

201X/XX/XX 05:46:00.708 - #2 - 27302/26963 - MemoryManager: viscus on assert
...
>> 1158731105004683264 :phase1#1 (groupTaskID=(sessionID=1407996198,replicationLinkID= (kVolSlot=XXXXXXXXX,srcCopyID=GlobalCopy(SiteUID(0xXXXXXX) 0) ,destCopyID=GlobalCopy(SiteUID(0xXXXXXXXXXXXXXXXX) 0) )),gridCopyID=0) using 0 credit 12463 min 512 max 13056 counter 269585 bound 282048 overld 275816 reachBound 0 standalone
...
201X/XX/XX 05:46:00.713 - #2 - 27091/26963 - RemoteLogSender: got event (uniqueId=0, eventTime=1555998360713692), EventID_KBOX_ASSERTION_FAILED(3031), SiteUID(0xXXXXXXXXXXXXXXX), seDetails=Sender=replication, Topic=DistributorGroupHandler, msg=Assertion failed: isPhase1CacheMemorySufficient(m_phase1SubConsumer) Line XXXX File DistributorGroupHandlerPhase1.cc PID: XXXXX Info: regular phase1 cache memory not sufficient
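
To check whether a collected replication log contains this signature, a small script can count the assertion events and pull the numeric counters off the phase1 MemoryManager line. The following is a minimal sketch in Python, not a Dell-provided tool. It assumes the log is available as plain text and that the lines resemble the excerpt above. The function names are hypothetical, and the counter names are copied from the excerpt without interpreting what they mean.

```python
import re

# Marker text taken from the assertion message in the log excerpt.
ASSERT_MARKER = "phase1 cache memory not sufficient"

# Counter names exactly as they appear on the MemoryManager line of the excerpt.
FIELDS = ("using", "credit", "min", "max", "counter", "bound", "overld", "reachBound")
FIELD_RE = re.compile(r"\b(" + "|".join(FIELDS) + r")\s+(\d+)")


def parse_phase1_counters(line):
    """Return the numeric counters found on a ':phase1#' MemoryManager line."""
    if ":phase1#" not in line:
        return {}
    return {name: int(value) for name, value in FIELD_RE.findall(line)}


def scan_log(lines):
    """Count assertion events and collect phase1 counters from an iterable of lines."""
    assertions = 0
    counters = []
    for line in lines:
        if ASSERT_MARKER in line:
            assertions += 1
        parsed = parse_phase1_counters(line)
        if parsed:
            counters.append(parsed)
    return assertions, counters
```

`scan_log` accepts any iterable of strings, so an open file object for the collected log can be passed to it directly. A high assertion count over a short period is consistent with the crash loop described in this article.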

Cause

When I/O arrives at the replica copy at a high rate (for example, during an initialization with extremely fast primary storage), the Distributor's Phase1 memory allocation, which is used for moving I/O between the RPAs and the journal, can reach 100% while additional I/O requests are still waiting in the queue. This can cause a race condition between freeing the memory in use and requesting memory for the queued requests. When this occurs, the RPA's replication process can crash. During a first-time initialization of a CG, this can lead to reboot regulation, because the same I/O rate resumes after every process crash.
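
The general shape of such a race can be illustrated with a toy model. The sketch below is a generic check-then-act example in Python and is not RecoverPoint's actual implementation. The `CreditPool` class, its sizes, and both grant functions are hypothetical names introduced only for illustration. The pool is full, some memory is released, and two queued requests both see the freed space before either one claims it.

```python
import threading


class CreditPool:
    """Toy fixed-size credit pool; names and sizes are illustrative only."""

    def __init__(self, total, used=0):
        self.total = total
        self.used = used
        self._lock = threading.Lock()

    def available(self):
        return self.total - self.used

    def release(self, n):
        with self._lock:
            self.used -= n

    def take_unchecked(self, n):
        # Trusts an availability check made earlier, which may already be stale.
        self.used += n

    def try_take(self, n):
        # Check and take under one lock, so no other request can slip in between.
        with self._lock:
            if self.total - self.used >= n:
                self.used += n
                return True
            return False


def racy_grant(pool, queued):
    """Simulate the bad interleaving: every request checks before any request takes."""
    approved = [n for n in queued if pool.available() >= n]
    for n in approved:
        pool.take_unchecked(n)
    return approved


def atomic_grant(pool, queued):
    """Each request checks and takes in one step; the rest stay queued."""
    return [n for n in queued if pool.try_take(n)]
```

With a pool of 100 credits that is full and then has 50 released, `racy_grant(pool, [40, 40])` approves both requests and leaves the pool over-committed, which is the kind of broken invariant a memory-sufficiency assertion would catch. `atomic_grant` approves only the first request and leaves the second queued.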

Resolution

Workaround:
1. Set I/O Throttling to either Low or High on the affected array(s) to limit how fast RecoverPoint reads from the production array(s).
2. Initialize CGs sequentially, starting no more than one or two at a time, to limit reads from the production array(s).
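
If initialization is driven by a script, the second workaround can be sketched as a simple batching loop. This Python example is a sketch under stated assumptions and does not call any RecoverPoint API. `start_init` and `is_initialized` are hypothetical callables that the operator would supply (for example, wrappers around whatever management interface is in use), and the default polling interval is an arbitrary choice.

```python
import time


def initialize_in_batches(groups, start_init, is_initialized, batch_size=2, poll_seconds=60):
    """Start initialization for at most batch_size groups at a time, in order.

    groups is a list of CG names. start_init and is_initialized are
    caller-supplied placeholders, not RecoverPoint functions.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for i in range(0, len(groups), batch_size):
        batch = groups[i:i + batch_size]
        for cg in batch:
            start_init(cg)
        # Wait for the whole batch to finish before starting the next one.
        while not all(is_initialized(cg) for cg in batch):
            time.sleep(poll_seconds)
```

Keeping `batch_size` at 1 or 2 matches the guidance above. The loop never starts a new CG until every CG in the current batch reports as initialized.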

Resolution:
This issue is addressed in RecoverPoint for Virtual Machines 5.1 and later.
This issue is not addressed in RecoverPoint Classic. Dell EMC Engineering is currently investigating this issue. A permanent fix is still in progress. Contact the Dell EMC Customer Support Center or your service representative for assistance and reference this solution ID.

Affected Products

RecoverPoint

Products

RecoverPoint, RecoverPoint for Virtual Machines

Article Number: 000168745

Article Type: Solution

Last Modified: 20 Nov 2020

Version: 2
