PowerEdge: How to fix Double Faults and Punctures in RAID Arrays

Summary: This article explains double faults and punctures in a RAID array and describes how to fix them.

Instructions

Table of contents

  1. Fixing double faults and RAID punctures
  2. Data Errors and Double Faults
  3. Punctures: What Are They and How Are They Caused?
  4. Preventing Problems Before They Happen and Solving Punctures After They Occur
  5. How To videos for creating/ deleting an array or importing/ exporting a foreign configuration

 

 


 

Warning: Following these steps results in the loss of all data on the array. Before performing the steps, ensure that all data on the array is backed up and that the steps do not impact any other arrays.

Fixing double faults and RAID punctures

  1. Discard the preserved cache (if it exists)
  2. Clear foreign configurations (if any)
  3. Delete the array
  4. Check for any failed drives
  5. Reseat any failed drives
  6. Clear any foreign configuration again
  7. Replace all failed drives, including drives in a predictive failure state
  8. Update the firmware (controller, backplane (BP), and drives) if needed
  9. Create the array
  10. Perform a Full Initialization (not a Fast Initialization)
  11. At this stage, the array should be ready to be used

Data Errors and Double Faults

RAID arrays are not immune to data errors. RAID controller and hard drive firmware contain functionality to detect and correct many types of data errors before they are written to an array/drive.

  • Data errors can be caused by physical bad blocks, such as a "Head Crash" or degradation of the platter's ability to magnetically store bits in a specific location.
  • A bad block, also known as a bad Logical Block Address (LBA), can also be caused by logical data errors, such as a "bit flip" or incorrect data being written to a drive.
  • Bad LBAs are commonly reported as the Sense Code 3/11/0.
  • Dell hardware-based RAID controllers offer features such as Patrol Read and Check Consistency to correct many data error scenarios.
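The sense code mentioned above is a three-part value (sense key / additional sense code / qualifier). The sketch below decodes it, assuming the fields are hexadecimal and using only the two lookup entries needed here, taken from the standard SCSI sense tables. The function name and table layout are illustrative and are not part of any Dell tool.

```python
# Minimal lookup tables: only the entries needed for this example.
SENSE_KEYS = {0x3: "MEDIUM ERROR"}
ASC_ASCQ = {(0x11, 0x00): "UNRECOVERED READ ERROR"}


def decode_sense(code: str) -> str:
    """Decode a 'key/asc/ascq' string (assumed hexadecimal fields)."""
    key, asc, ascq = (int(part, 16) for part in code.split("/"))
    key_text = SENSE_KEYS.get(key, f"sense key {key:#x}")
    detail = ASC_ASCQ.get((asc, ascq), f"ASC/ASCQ {asc:02X}h/{ascq:02X}h")
    return f"{key_text}: {detail}"


print(decode_sense("3/11/0"))
```

A code that is not in the tables is returned in numeric form rather than guessed at.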

Performing regular Check Consistency operations corrects single faults, whether they are caused by a physical bad block or by a logical data error.

Check Consistency will also mitigate the risk of a double fault condition in the event of additional errors.
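Why a single fault is correctable while a double fault is not can be illustrated with RAID 5 parity arithmetic. The following is a minimal sketch, assuming one stripe of three data blocks plus one XOR parity block. It models the principle only, not the controller's actual implementation, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover(stripe):
    """Rebuild the one unreadable block (None) in a stripe.

    Returns the repaired stripe, or None when more than one block is
    unreadable (a double fault), because parity cannot rebuild it.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        return None  # double fault: the data in this stripe is lost
    repaired = list(stripe)
    if missing:
        # XOR of all surviving blocks yields the missing block.
        repaired[missing[0]] = xor_blocks([b for b in stripe if b is not None])
    return repaired


data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data + [xor_blocks(data)]  # three data blocks plus parity

single = [stripe[0], None, stripe[2], stripe[3]]  # one unreadable block
double = [None, None, stripe[2], stripe[3]]       # two unreadable blocks

print(recover(single) == stripe)  # True
print(recover(double))            # None
```

Correcting each single fault while the rest of the stripe is still readable is what keeps a later second error from becoming a double fault.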

 

Figure 1 Multiple Single Faults in a RAID 5 array - Optimal Array

 

Figure 2 Double Fault with a Failed Drive (Data in Stripes 1 and 2 is lost) - Degraded Array.

 

Figure 3 Punctured Stripes (Data in Stripes 1 and 2 is lost due to double fault condition) - Optimal Array.

 


 

Punctures: What Are They and How Are They Caused?

A puncture is a feature of Dell's PERC controllers designed to allow the controller to restore the redundancy of the array despite the loss of data caused by a double fault condition.

  • A puncture is also known as "rebuild with errors."
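  • A puncture can occur in one of two situations: a double fault already exists (a data error on an online drive is propagated to a rebuilding drive), or a double fault does not yet exist (a bad block develops on an online drive while the array is degraded, and that LBA is punctured).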
  • A puncture can occur in three locations: a blank space, a non-critical data space, or a data space that is accessed.
  • Any condition that causes data to be inaccessible in the same stripe on more than one drive is a double fault.
  • Double faults cause the loss of all data within the impacted stripe.
  • All punctures are double faults, but not all double faults are punctures.
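The "rebuild with errors" behavior described above can be modeled in a few lines. This is a simplified sketch under stated assumptions: the array is reduced to a map of unreadable stripe numbers per drive, and a stripe is treated as punctured whenever a surviving drive has a bad block in a stripe that is being rebuilt. The names and the data layout are hypothetical, and the real firmware logic is not shown in this article.

```python
def rebuild_with_errors(stripe_count, bad_blocks, rebuilding_drive):
    """Return (clean, punctured) stripe lists after rebuilding one drive.

    bad_blocks maps a drive name to the set of stripe numbers in which
    that drive has an unreadable block.
    """
    clean, punctured = [], []
    for stripe in range(stripe_count):
        # The block on the rebuilding drive is always missing, so any bad
        # block on a surviving drive makes this stripe a double fault.
        extra_faults = [
            drive for drive, bad in bad_blocks.items()
            if drive != rebuilding_drive and stripe in bad
        ]
        # The rebuild does not stop: the stripe is recorded as punctured
        # and the rebuild continues with the next stripe.
        (punctured if extra_faults else clean).append(stripe)
    return clean, punctured


bad_blocks = {"drive0": {1}, "drive1": {2}, "drive2": set()}
clean, punctured = rebuild_with_errors(4, bad_blocks, rebuilding_drive="drive2")
print(clean, punctured)  # [0, 3] [1, 2]
```

In this model the rebuild finishes and redundancy is restored for stripes 0 and 3, while the data in stripes 1 and 2 is lost, which mirrors the scenario in Figures 2 and 3.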

 


Preventing Problems Before They Happen and Solving Punctures After They Occur

Proactive maintenance can correct existing errors and prevent some errors from occurring.

  • Update drivers and firmware on controllers, hard drives, backplanes, and other devices.
  • Perform routine Check Consistency operations.
  • Review logs for indications of problems.
Note: If the Check Consistency operation completes without errors, you can safely assume that the array is now healthy and the puncture is removed. Data can now be restored to the healthy array.
 
Caution: If a known or suspected double fault or puncture condition exists, follow these steps to minimize the risk of more severe problems:
  • Perform a routine Check Consistency (the array must be optimal)
  • Determine if hardware problems exist
  • Check the controller log
  • Perform hardware diagnostics
  • Contact Dell Technical Support as needed
Note: Even after these steps have been completed, additional concerns remain. Punctures can cause hard drives to go into a predictive failure status over time. Data errors that are propagated to a drive are reported as media errors on that drive, even though no hardware problem exists.
 
Note: Monitoring the system allows problems to be detected and corrected in a timely manner, which also reduces the risk of more serious problems.
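Reviewing the controller log can be partly automated. The sketch below scans exported log text for two indicators discussed in this article: the word "puncture" and the bad-LBA sense code 3/11/0. The sample lines and their format are invented for illustration and do not reproduce real controller output, so the patterns would need adjusting to the actual log export.

```python
import re

# Patterns are assumptions based on the indicators named in this article.
PATTERNS = {
    "puncture": re.compile(r"punctur", re.IGNORECASE),
    "bad LBA sense code": re.compile(r"\b3/11/0{1,2}\b"),
}


def scan_log(lines):
    """Return (line_number, label, text) for each line matching a pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label, line.strip()))
    return hits


# Hypothetical log excerpt; the layout is made up for this example.
sample = [
    "2024-01-01 02:00:01 Patrol Read started",
    "2024-01-01 02:14:09 PD 00(e0x20/s0) Sense: 3/11/00",
    "2024-01-01 02:14:10 Puncturing bad block on PD 00(e0x20/s0)",
    "2024-01-01 03:30:00 Patrol Read complete",
]

for hit in scan_log(sample):
    print(hit)
```

Any hit is a prompt to run hardware diagnostics and a Check Consistency, not a diagnosis in itself.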


 


How To videos for creating/ deleting an array or importing/ exporting a foreign configuration

 

How to Create or Delete a Virtual Disk in iDRAC 9

 

Duration: 00:01:53

 

 

How to Import Foreign Configuration for Dell PERC

 

Duration: 00:02:07

 

 

How To Clear Foreign Configuration for Dell PERC

 

Duration: 00:02:02


Affected Products

OEMR R240, OEMR R250, OEMR R260, OEMR R340, OEMR R350, OEMR XE R350, OEMR R360, OEMR XE R360, OEMR R440, OEMR R450, OEMR R540, OEMR R550, OEMR R640, OEMR XL R640, OEMR R6415, OEMR R650, OEMR R650xs, OEMR R6515, OEMR R6525, OEMR R660, OEMR XL R660, OEMR R660xs, OEMR R6615, OEMR R6625, OEMR R740, OEMR XL R740, OEMR R740xd, OEMR XL R740xd, OEMR R740xd2, OEMR R7415, OEMR R7425, OEMR R750, OEMR R750xa, OEMR R750xs, OEMR R7515, OEMR R7525, OEMR R760, OEMR R760xa, OEMR R760XD2, OEMR XL R760, OEMR R760xs, OEMR R7615, OEMR R7625, OEMR R840, OEMR R860, OEMR R940, OEMR R940xa, OEMR R960, OEMR T340, OEMR T350, OEMR T360, OEMR T440, OEMR T550, OEMR T560, OEMR T640, OEMR XL T640, OEMR XL R240, OEMR XL R340, OEMR XL R660xs, OEMR XL R6615, OEMR XL R6625, OEMR XL R760xs, OEMR XL R7615, OEMR XL R7625, PowerEdge RAID Controller H345, PowerEdge RAID Controller H355 Front SAS, PowerEdge RAID Controller H355 Adapter SAS, PowerEdge RAID Controller H750 Adapter SAS, PowerEdge RAID Controller H755 Adapter, PowerEdge RAID Controller H755 Front SAS, PowerEdge RAID Controller H965i Adapter, PowerEdge C4140, PowerEdge C6400, PowerEdge C6420, PowerEdge C6520, PowerEdge C6525, PowerEdge C6600, PowerEdge C6615, PowerEdge C6620, PowerEdge FC640, PowerEdge HS5610, PowerEdge HS5620, PowerEdge M640, PowerEdge M640 (for PE VRTX), PowerEdge MX5016s, PowerEdge MX7000, PowerEdge MX740C, PowerEdge MX750c, PowerEdge MX760c, PowerEdge MX840C, PowerEdge R240, PowerEdge R250, PowerEdge R260, PowerEdge R340, PowerEdge R350, PowerEdge R360, PowerEdge R440, PowerEdge R450, PowerEdge R540, PowerEdge R550, PowerEdge R640, PowerEdge R6415, PowerEdge R650, PowerEdge R650xs, PowerEdge R6515, PowerEdge R6525, PowerEdge R660, PowerEdge R660xs, PowerEdge R6615, PowerEdge R6625, PowerEdge R670, PowerEdge R740, PowerEdge R740XD, PowerEdge R740XD2, PowerEdge R7415, PowerEdge R7425, PowerEdge R750, PowerEdge R750XA, PowerEdge R750xs, PowerEdge R7515, PowerEdge R7525, PowerEdge R760, PowerEdge R760XA, PowerEdge R760xd2, PowerEdge R760xs, PowerEdge R7615, PowerEdge R7625, PowerEdge R770, PowerEdge R840, PowerEdge R860, PowerEdge R940, PowerEdge R940xa, PowerEdge R960, PowerEdge RAID Controller H330, PowerEdge RAID Controller H730P, PowerEdge RAID Controller H740P, PowerEdge RAID Controller H965e Adapter, PowerEdge T340, PowerEdge T350, PowerEdge T360, PowerEdge T440, PowerEdge T550, PowerEdge T560, PowerEdge T640 ...
Article Properties
Article Number: 000139251
Article Type: How To
Last Modified: 21 Nov 2024
Version:  8